Search Results: "xam"

25 January 2024

Joachim Breitner: GHC Steering Committee Retrospective

After seven years of service as member and secretary on the GHC Steering Committee, I have resigned from that role. So this is a good time to look back and retrace the formation of the GHC proposal process and committee. In my memory, I helped define and shape the proposal process, optimizing it for effectiveness and throughput, but memory can be misleading, and judging from the paper trail in my email archives, this was indeed mostly Ben Gamari s and Richard Eisenberg s achievement: Already in Summer of 2016, Ben Gamari set up the ghc-proposals Github repository with a sketch of a process and sent out a call for nominations on the GHC user s mailing list, which I replied to. The Simons picked the first set of members, and in the fall of 2016 we discussed the committee s by-laws and procedures. As so often, Richard was an influential shaping force here.

Three ingredients For example, it was him that suggested that for each proposal we have one committee member be the Shepherd , overseeing the discussion. I believe this was one ingredient for the process effectiveness: There is always one person in charge, and thus we avoid the delays incurred when any one of a non-singleton set of volunteers have to do the next step (and everyone hopes someone else does it). The next ingredient was that we do not usually require a vote among all members (again, not easy with volunteers with limited bandwidth and occasional phases of absence). Instead, the shepherd makes a recommendation (accept/reject), and if the other committee members do not complain, this silence is taken as consent, and we come to a decision. It seems this idea can also be traced back on Richard, who suggested that once a decision is requested, the shepherd [generates] consensus. If consensus is elusive, then we vote. At the end of the year we agreed and wrote down these rules, created the mailing list for our internal, but publicly archived committee discussions, and began accepting proposals, starting with Adam Gundry s OverloadedRecordFields. At that point, there was no secretary role yet, so how I did become one? It seems that in February 2017 I started to clean-up and refine the process documentation, fixing bugs in the process (like requiring authors to set Github labels when they don t even have permissions to do that). This in particular meant that someone from the committee had to manually handle submissions and so on, and by the aforementioned principle that at every step there ought to be exactly one person in change, the role of a secretary followed naturally. In the email in which I described that role I wrote:
Simon already shoved me towards picking up the secretary hat, to reduce load on Ben.
So when I merged the updated process documentation, I already listed myself secretary . It wasn t just Simon s shoving that put my into the role, though. I dug out my original self-nomination email to Ben, and among other things I wrote:
I also hope that there is going to be clear responsibilities and a clear workflow among the committee. E.g. someone (possibly rotating), maybe called the secretary, who is in charge of having an initial look at proposals and then assigning it to a member who shepherds the proposal.
So it is hardly a surprise that I became secretary, when it was dear to my heart to have a smooth continuous process here. I am rather content with the result: These three ingredients single secretary, per-proposal shepherds, silence-is-consent helped the committee to be effective throughout its existence, even as every once in a while individual members dropped out.

Ulterior motivation I must admit, however, there was an ulterior motivation behind me grabbing the secretary role: Yes, I did want the committee to succeed, and I did want that authors receive timely, good and decisive feedback on their proposals but I did not really want to have to do that part. I am, in fact, a lousy proposal reviewer. I am too generous when reading proposals, and more likely mentally fill gaps in a specification rather than spotting them. Always optimistically assuming that the authors surely know what they are doing, rather than critically assessing the impact, the implementation cost and the interaction with other language features. And, maybe more importantly: why should I know which changes are good and which are not so good in the long run? Clearly, the authors cared enough about a proposal to put it forward, so there is some need and I do believe that Haskell should stay an evolving and innovating language but how does this help me decide about this or that particular feature. I even, during the formation of the committee, explicitly asked that we write down some guidance on Vision and Guideline ; do we want to foster change or innovation, or be selective gatekeepers? Should we accept features that are proven to be useful, or should we accept features so that they can prove to be useful? This discussion, however, did not lead to a concrete result, and the assessment of proposals relied on the sum of each member s personal preference, expertise and gut feeling. I am not saying that this was a mistake: It is hard to come up with a general guideline here, and even harder to find one that does justice to each individual proposal. So the secret motivation for me to grab the secretary post was that I could contribute without having to judge proposals. Being secretary allowed me to assign most proposals to others to shepherd, and only once in a while myself took care of a proposal, when it seemed to be very straight-forward. Sneaky, ain t it?

7 Years later For years to come I happily played secretary: When an author finished their proposal and public discussion ebbed down they would ping me on GitHub, I would pick a suitable shepherd among the committee and ask them to judge the proposal. Eventually, the committee would come to a conclusion, usually by implicit consent, sometimes by voting, and I d merge the pull request and update the metadata thereon. Every few months I d summarize the current state of affairs to the committee (what happened since the last update, which proposals are currently on our plate), and once per year gathered the data for Simon Peyton Jones annually GHC Status Report. Sometimes some members needed a nudge or two to act. Some would eventually step down, and I d sent around a call for nominations and when the nominations came in, distributed them off-list among the committee and tallied the votes. Initially, that was exciting. For a long while it was a pleasant and rewarding routine. Eventually, it became a mere chore. I noticed that I didn t quite care so much anymore about some of the discussion, and there was a decent amount of naval-gazing, meta-discussions and some wrangling about claims of authority that was probably useful and necessary, but wasn t particularly fun. I also began to notice weaknesses in the processes that I helped shape: We could really use some more automation for showing proposal statuses, notifying people when they have to act, and nudging them when they don t. The whole silence-is-assent approach is good for throughput, but not necessary great for quality, and maybe the committee members need to be pushed more firmly to engage with each proposal. Like GHC itself, the committee processes deserve continuous refinement and refactoring, and since I could not muster the motivation to change my now well-trod secretarial ways, it was time for me to step down. Luckily, Adam Gundry volunteered to take over, and that makes me feel much less bad for quitting. Thanks for that! And although I am for my day job now enjoying a language that has many of the things out of the box that for Haskell are still only language extensions or even just future proposals (dependent types, BlockArguments, do notation with ( foo) expressions and Unicode), I m still around, hosting the Haskell Interlude Podcast, writing on this blog and hanging out at ZuriHac etc.

24 January 2024

Thomas Lange: FAI 6.2 released

After more than one a year, a new minor FAI version is available, but it includes some interesting new features. Here a the items from the NEWS file: fai (6.2) unstable; urgency=low In the past the command fai-cd was only used for creating installation ISOs, that could be used from CD or USB stick. Now it possible to create a live ISO. Therefore you create your live chroot environment using 'fai dirinstall' and then convert it to a bootable live ISO using fai-cd. See man fai-cd(8) for an example. Years ago I had the idea to use the remaining disk space on an USB stick after copying an ISO onto it. I've blogged about this recently: https://blog.fai-project.org/posts/extending-iso-images/ The new FAI version includes the tool mk-data-partition for adding a data partition to the ISO itself or to an USB stick. FAI detects this data partition, mounts it to /media/data and can then use various configurations from it. You may want to copy your own set of .deb packages or your whole FAI config space to this partition. FAI now automatically searches this partition for usable FAI configuration data and packages. FAI will install all packages from pkgs/<CLASSNAME> if the equivalent class is defined. Setting FAI_CONFIG_SRC=detect:// now looks into the data partition for the subdirectory 'config' and uses this as the config space. So it's now possible to modify an existing ISO (that is read-only) and make changes to the config space. If there's no config directory in the data partition FAI uses the default location on the ISO. The tool fai-kvm, which starts virtual machines can now boot an ISO not only as CD but also as USB stick. Sometimes users want to adjust the list of disks before the partitioning is startet. Therefore FAI provides several new functions including You can select individual disks by their model name or even the serial number. Two new FAI flags were added (tmux and screen) that make it easy to run FAI inside a tmux or screen session. And finally FAI uses systemd. Yeah! This technical change was waiting since 2015 in a merge request from Moritz 'Morty' Str be, that would enable using systemd during the installation. Before FAI still was using old-style SYSV init scripts and did not started systemd. I didn't tried to apply the patch, because I was afraid that it would need much time to make it work. But then in may 2023 Juri Grabowski just gave it a try at MiniDebConf Hamburg, and voil it just works! Many, many thanks to Moritz and Juri for their bravery. The whole changelog can be found at https://tracker.debian.org/media/packages/f/fai/changelog-6.2 New ISOs for FAI are also available including an example of a Xfce desktop live ISO: https://fai-project.org/fai-cd/ The FAIme service for creating customized installation ISOs will get its update later. The new packages are available for bookworm by adding this line to your sources.list: deb https://fai-project.org/download bookworm koeln

Dirk Eddelbuettel: RcppAnnoy 0.0.22 on CRAN: Maintenance

annoy image A very minor maintenance release, now at version 0.0.22, of RcppAnnoy has arrived on CRAN. RcppAnnoy is the Rcpp-based R integration of the nifty Annoy library by Erik Bernhardsson. Annoy is a small and lightweight C++ template header library for very fast approximate nearest neighbours originally developed to drive the Spotify music discovery algorithm. It had all the buzzwords already a decade ago: it is one of the algorithms behind (drum roll ) vector search as it finds approximate matches very quickly and also allows to persist the data. This release responds to a CRAN request to clean up empty macros and sections in Rd files. Details of the release follow based on the NEWS file.

Changes in version 0.0.22 (2024-01-23)
  • Replace empty examples macro to satisfy CRAN request.

Courtesy of my CRANberries, there is also a diffstat report for this release. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

20 January 2024

Niels Thykier: Making debputy: Writing declarative parsing logic

In this blog post, I will cover how debputy parses its manifest and the conceptual improvements I did to make parsing of the manifest easier. All instructions to debputy are provided via the debian/debputy.manifest file and said manifest is written in the YAML format. After the YAML parser has read the basic file structure, debputy does another pass over the data to extract the information from the basic structure. As an example, the following YAML file:
manifest-version: "0.1"
installations:
  - install:
      source: foo
      dest-dir: usr/bin
would be transformed by the YAML parser into a structure resembling:
 
  "manifest-version": "0.1",
  "installations": [
      
       "install":  
         "source": "foo",
         "dest-dir": "usr/bin",
        
      
  ]
 
This structure is then what debputy does a pass on to translate this into an even higher level format where the "install" part is translated into an InstallRule. In the original prototype of debputy, I would hand-write functions to extract the data that should be transformed into the internal in-memory high level format. However, it was quite tedious. Especially because I wanted to catch every possible error condition and report "You are missing the required field X at Y" rather than the opaque KeyError: X message that would have been the default. Beyond being tedious, it was also quite error prone. As an example, in debputy/0.1.4 I added support for the install rule and you should allegedly have been able to add a dest-dir: or an as: inside it. Except I crewed up the code and debputy was attempting to look up these keywords from a dict that could never have them. Hand-writing these parsers were so annoying that it demotivated me from making manifest related changes to debputy simply because I did not want to code the parsing logic. When I got this realization, I figured I had to solve this problem better. While reflecting on this, I also considered that I eventually wanted plugins to be able to add vocabulary to the manifest. If the API was "provide a callback to extract the details of whatever the user provided here", then the result would be bad.
  1. Most plugins would probably throw KeyError: X or ValueError style errors for quite a while. Worst case, they would end on my table because the user would have a hard time telling where debputy ends and where the plugins starts. "Best" case, I would teach debputy to say "This poor error message was brought to you by plugin foo. Go complain to them". Either way, it would be a bad user experience.
  2. This even assumes plugin providers would actually bother writing manifest parsing code. If it is that difficult, then just providing a custom file in debian might tempt plugin providers and that would undermine the idea of having the manifest be the sole input for debputy.
So beyond me being unsatisfied with the current situation, it was also clear to me that I needed to come up with a better solution if I wanted externally provided plugins for debputy. To put a bit more perspective on what I expected from the end result:
  1. It had to cover as many parsing errors as possible. An error case this code would handle for you, would be an error where I could ensure it sufficient degree of detail and context for the user.
  2. It should be type-safe / provide typing support such that IDEs/mypy could help you when you work on the parsed result.
  3. It had to support "normalization" of the input, such as
           # User provides
           - install: "foo"
           # Which is normalized into:
           - install:
               source: "foo"
4) It must be simple to tell  debputy  what input you expected.
At this point, I remembered that I had seen a Python (PYPI) package where you could give it a TypedDict and an arbitrary input (Sadly, I do not remember the name). The package would then validate the said input against the TypedDict. If the match was successful, you would get the result back casted as the TypedDict. If the match was unsuccessful, the code would raise an error for you. Conceptually, this seemed to be a good starting point for where I wanted to be. Then I looked a bit on the normalization requirement (point 3). What is really going on here is that you have two "schemas" for the input. One is what the programmer will see (the normalized form) and the other is what the user can input (the manifest form). The problem is providing an automatic normalization from the user input to the simplified programmer structure. To expand a bit on the following example:
# User provides
- install: "foo"
# Which is normalized into:
- install:
    source: "foo"
Given that install has the attributes source, sources, dest-dir, as, into, and when, how exactly would you automatically normalize "foo" (str) into source: "foo"? Even if the code filtered by "type" for these attributes, you would end up with at least source, dest-dir, and as as candidates. Turns out that TypedDict actually got this covered. But the Python package was not going in this direction, so I parked it here and started looking into doing my own. At this point, I had a general idea of what I wanted. When defining an extension to the manifest, the plugin would provide debputy with one or two definitions of TypedDict. The first one would be the "parsed" or "target" format, which would be the normalized form that plugin provider wanted to work on. For this example, lets look at an earlier version of the install-examples rule:
# Example input matching this typed dict.
#    
#       "source": ["foo"]
#       "into": ["pkg"]
#    
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) that should have these files installed.
    into: NotRequired[List[str]]
In this form, the install-examples has two attributes - both are list of strings. On the flip side, what the user can input would look something like this:
# Example input matching this typed dict.
#    
#       "source": "foo"
#       "into": "pkg"
#    
#
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[str]
    # We allow the user to write  into: foo  in addition to  into: [foo] 
    into: Union[str, List[str]]
FullInstallExamplesManifestFormat = Union[
    InstallExamplesManifestFormat,
    List[str],
    str,
]
The idea was that the plugin provider would use these two definitions to tell debputy how to parse install-examples. Pseudo-registration code could look something like:
def _handler(
    normalized_form: InstallExamplesTargetFormat,
) -> InstallRule:
    ...  # Do something with the normalized form and return an InstallRule.
concept_debputy_api.add_install_rule(
  keyword="install-examples",
  target_form=InstallExamplesTargetFormat,
  manifest_form=FullInstallExamplesManifestFormat,
  handler=_handler,
)
This was my conceptual target and while the current actual API ended up being slightly different, the core concept remains the same.
From concept to basic implementation Building this code is kind like swallowing an elephant. There was no way I would just sit down and write it from one end to the other. So the first prototype of this did not have all the features it has now. Spoiler warning, these next couple of sections will contain some Python typing details. When reading this, it might be helpful to know things such as Union[str, List[str]] being the Python type for either a str (string) or a List[str] (list of strings). If typing makes your head spin, these sections might less interesting for you. To build this required a lot of playing around with Python's introspection and typing APIs. My very first draft only had one "schema" (the normalized form) and had the following features:
  • Read TypedDict.__required_attributes__ and TypedDict.__optional_attributes__ to determine which attributes where present and which were required. This was used for reporting errors when the input did not match.
  • Read the types of the provided TypedDict, strip the Required / NotRequired markers and use basic isinstance checks based on the resulting type for str and List[str]. Again, used for reporting errors when the input did not match.
This prototype did not take a long (I remember it being within a day) and worked surprisingly well though with some poor error messages here and there. Now came the first challenge, adding the manifest format schema plus relevant normalization rules. The very first normalization I did was transforming into: Union[str, List[str]] into into: List[str]. At that time, source was not a separate attribute. Instead, sources was a Union[str, List[str]], so it was the only normalization I needed for all my use-cases at the time. There are two problems when writing a normalization. First is determining what the "source" type is, what the target type is and how they relate. The second is providing a runtime rule for normalizing from the manifest format into the target format. Keeping it simple, the runtime normalizer for Union[str, List[str]] -> List[str] was written as:
def normalize_into_list(x: Union[str, List[str]]) -> List[str]:
    return x if isinstance(x, list) else [x]
This basic form basically works for all types (assuming none of the types will have List[List[...]]). The logic for determining when this rule is applicable is slightly more involved. My current code is about 100 lines of Python code that would probably lose most of the casual readers. For the interested, you are looking for _union_narrowing in declarative_parser.py With this, when the manifest format had Union[str, List[str]] and the target format had List[str] the generated parser would silently map a string into a list of strings for the plugin provider. But with that in place, I had covered the basics of what I needed to get started. I was quite excited about this milestone of having my first keyword parsed without handwriting the parser logic (at the expense of writing a more generic parse-generator framework).
Adding the first parse hint With the basic implementation done, I looked at what to do next. As mentioned, at the time sources in the manifest format was Union[str, List[str]] and I considered to split into a source: str and a sources: List[str] on the manifest side while keeping the normalized form as sources: List[str]. I ended up committing to this change and that meant I had to solve the problem getting my parser generator to understand the situation:
# Map from
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[str]
    # We allow the user to write  into: foo  in addition to  into: [foo] 
    into: Union[str, List[str]]
# ... into
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) that should have these files installed.
    into: NotRequired[List[str]]
There are two related problems to solve here:
  1. How will the parser generator understand that source should be normalized and then mapped into sources?
  2. Once that is solved, the parser generator has to understand that while source and sources are declared as NotRequired, they are part of a exactly one of rule (since sources in the target form is Required). This mainly came down to extra book keeping and an extra layer of validation once the previous step is solved.
While working on all of this type introspection for Python, I had noted the Annotated[X, ...] type. It is basically a fake type that enables you to attach metadata into the type system. A very random example:
# For all intents and purposes,  foo  is a string despite all the  Annotated  stuff.
foo: Annotated[str, "hello world"] = "my string here"
The exciting thing is that you can put arbitrary details into the type field and read it out again in your introspection code. Which meant, I could add "parse hints" into the type. Some "quick" prototyping later (a day or so), I got the following to work:
# Map from
#      
#        "source": "foo"  # (or "sources": ["foo"])
#        "into": "pkg"
#      
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[
        Annotated[
            str,
            DebputyParseHint.target_attribute("sources")
        ]
    ]
    # We allow the user to write  into: foo  in addition to  into: [foo] 
    into: Union[str, List[str]]
# ... into
#      
#        "source": ["foo"]
#        "into": ["pkg"]
#      
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) that should have these files installed.
    into: NotRequired[List[str]]
Without me (as a plugin provider) writing a line of code, I can have debputy rename or "merge" attributes from the manifest form into the normalized form. Obviously, this required me (as the debputy maintainer) to write a lot code so other me and future plugin providers did not have to write it.
High level typing At this point, basic normalization between one mapping to another mapping form worked. But one thing irked me with these install rules. The into was a list of strings when the parser handed them over to me. However, I needed to map them to the actual BinaryPackage (for technical reasons). While I felt I was careful with my manual mapping, I knew this was exactly the kind of case where a busy programmer would skip the "is this a known package name" check and some user would typo their package resulting in an opaque KeyError: foo. Side note: "Some user" was me today and I was super glad to see debputy tell me that I had typoed a package name (I would have been more happy if I had remembered to use debputy check-manifest, so I did not have to wait through the upstream part of the build that happened before debhelper passed control to debputy...) I thought adding this feature would be simple enough. It basically needs two things:
  1. Conversion table where the parser generator can tell that BinaryPackage requires an input of str and a callback to map from str to BinaryPackage. (That is probably lie. I think the conversion table came later, but honestly I do remember and I am not digging into the git history for this one)
  2. At runtime, said callback needed access to the list of known packages, so it could resolve the provided string.
It was not super difficult given the existing infrastructure, but it did take some hours of coding and debugging. Additionally, I added a parse hint to support making the into conditional based on whether it was a single binary package. With this done, you could now write something like:
# Map from
class InstallExamplesManifestFormat(TypedDict):
    # Note that sources here is split into source (str) vs. sources (List[str])
    sources: NotRequired[List[str]]
    source: NotRequired[
        Annotated[
            str,
            DebputyParseHint.target_attribute("sources")
        ]
    ]
    # We allow the user to write  into: foo  in addition to  into: [foo] 
    into: Union[BinaryPackage, List[BinaryPackage]]
# ... into
class InstallExamplesTargetFormat(TypedDict):
    # Which source files to install (dest-dir is fixed)
    sources: List[str]
    # Which package(s) that should have these files installed.
    into: NotRequired[
        Annotated[
            List[BinaryPackage],
            DebputyParseHint.required_when_multi_binary()
        ]
    ]
Code-wise, I still had to check for into being absent and providing a default for that case (that is still true in the current codebase - I will hopefully fix that eventually). But I now had less room for mistakes and a standardized error message when you misspell the package name, which was a plus.
The added side-effect - Introspection A lovely side-effect of all the parsing logic being provided to debputy in a declarative form was that the generated parser snippets had fields containing all expected attributes with their types, which attributes were required, etc. This meant that adding an introspection feature where you can ask debputy "What does an install rule look like?" was quite easy. The code base already knew all of this, so the "hard" part was resolving the input the to concrete rule and then rendering it to the user. I added this feature recently along with the ability to provide online documentation for parser rules. I covered that in more details in my blog post Providing online reference documentation for debputy in case you are interested. :)
Wrapping it up This was a short insight into how debputy parses your input. With this declarative technique:
  • The parser engine handles most of the error reporting meaning users get most of the errors in a standard format without the plugin provider having to spend any effort on it. There will be some effort in more complex cases. But the common cases are done for you.
  • It is easy to provide flexibility to users while avoiding having to write code to normalize the user input into a simplified programmer oriented format.
  • The parser handles mapping from basic types into higher forms for you. These days, we have high level types like FileSystemMode (either an octal or a symbolic mode), different kind of file system matches depending on whether globs should be performed, etc. These types includes their own validation and parsing rules that debputy handles for you.
  • Introspection and support for providing online reference documentation. Also, debputy checks that the provided attribute documentation covers all the attributes in the manifest form. If you add a new attribute, debputy will remind you if you forget to document it as well. :)
In this way everybody wins. Yes, writing this parser generator code was more enjoyable than writing the ad-hoc manual parsers it replaced. :)

17 January 2024

Colin Watson: Task management

Now that I m freelancing, I need to actually track my time, which is something I ve had the luxury of not having to do before. That meant something of a rethink of the way I ve been keeping track of my to-do list. Up to now that was a combination of things like the bug lists for the projects I m working on at the moment, whatever task tracking system Canonical was using at the moment (Jira when I left), and a giant flat text file in which I recorded logbook-style notes of what I d done each day plus a few extra notes at the bottom to remind myself of particularly urgent tasks. I could have started manually adding times to each logbook entry, but ugh, let s not. In general, I had the following goals (which were a bit reminiscent of my address book): I didn t do an elaborate evaluation of multiple options, because I m not trying to come up with the best possible solution for a client here. Also, there are a bazillion to-do list trackers out there and if I tried to evaluate them all I d never do anything else. I just wanted something that works well enough for me. Since it came up on Mastodon: a bunch of people swear by Org mode, which I know can do at least some of this sort of thing. However, I don t use Emacs and don t plan to use Emacs. nvim-orgmode does have some support for time tracking, but when I ve tried vim-based versions of Org mode in the past I ve found they haven t really fitted my brain very well. Taskwarrior and Timewarrior One of the other Freexian collaborators mentioned Taskwarrior and Timewarrior, so I had a look at those. The basic idea of Taskwarrior is that you have a task command that tracks each task as a blob of JSON and provides subcommands to let you add, modify, and remove tasks with a minimum of friction. task add adds a task, and you can add metadata like project:Personal (I always make sure every task has a project, for ease of filtering). Just running task shows you a task list sorted by Taskwarrior s idea of urgency, with an ID for each task, and there are various other reports with different filtering and verbosity. task <id> annotate lets you attach more information to a task. task <id> done marks it as done. So far so good, so a redacted version of my to-do list looks like this:
$ task ls
ID A Project     Tags                 Description
17   Freexian                         Add Incus support to autopkgtest [2]
 7   Columbiform                      Figure out Lloyds online banking [1]
 2   Debian                           Fix troffcvt for groff 1.23.0 [1]
11   Personal                         Replace living room curtain rail
Once I got comfortable with it, this was already a big improvement. I haven t bothered to learn all the filtering gadgets yet, but it was easy enough to see that I could do something like task all project:Personal and it d show me both pending and completed tasks in that project, and that all the data was stored in ~/.task - though I have to say that there are enough reporting bells and whistles that I haven t needed to poke around manually. In combination with the regular backups that I do anyway (you do too, right?), this gave me enough confidence to abandon my previous text-file logbook approach. Next was time tracking. Timewarrior integrates with Taskwarrior, albeit in an only semi-packaged way, and it was easy enough to set that up. Now I can do:
$ task 25 start
Starting task 00a9516f 'Write blog post about task tracking'.
Started 1 task.
Note: '"Write blog post about task tracking"' is a new tag.
Tracking Columbiform "Write blog post about task tracking"
  Started 2024-01-10T11:28:38
  Current                  38
  Total               0:00:00
You have more urgent tasks.
Project 'Columbiform' is 25% complete (3 of 4 tasks remaining).
When I stop work on something, I do task active to find the ID, then task <id> stop. Timewarrior does the tedious stopwatch business for me, and I can manually enter times if I forget to start/stop a task. Then the really useful bit: I can do something like timew summary :month <name-of-client> and it tells me how much to bill that client for this month. Perfect. I also started using VIT to simplify the day-to-day flow a little, which means I m normally just using one or two keystrokes rather than typing longer commands. That isn t really necessary from my point of view, but it does save some time. Android integration I left Android integration for a bit later since it wasn t essential. When I got round to it, I have to say that it felt a bit clumsy, but it did eventually work. The first step was to set up a taskserver. Most of the setup procedure was OK, but I wanted to use Let s Encrypt to minimize the amount of messing around with CAs I had to do. Getting this to work involved hitting things with sticks a bit, and there s still a local CA involved for client certificates. What I ended up with was a certbot setup with the webroot authenticator and a custom deploy hook as follows (with cert_name replaced by a DNS name in my house domain):
#! /bin/sh
set -eu
cert_name=taskd.example.org
found=false
for domain in $RENEWED_DOMAINS; do
    case "$domain" in
        $cert_name)
            found=:
            ;;
    esac
done
$found   exit 0
install -m 644 "/etc/letsencrypt/live/$cert_name/fullchain.pem" \
    /var/lib/taskd/pki/fullchain.pem
install -m 640 -g Debian-taskd "/etc/letsencrypt/live/$cert_name/privkey.pem" \
    /var/lib/taskd/pki/privkey.pem
systemctl restart taskd.service
I could then set this in /etc/taskd/config (server.crl.pem and ca.cert.pem were generated using the documented taskserver setup procedure):
server.key=/var/lib/taskd/pki/privkey.pem
server.cert=/var/lib/taskd/pki/fullchain.pem
server.crl=/var/lib/taskd/pki/server.crl.pem
ca.cert=/var/lib/taskd/pki/ca.cert.pem
Then I could set taskd.ca on my laptop to /usr/share/ca-certificates/mozilla/ISRG_Root_X1.crt and otherwise follow the client setup instructions, run task sync init to get things started, and then task sync every so often to sync changes between my laptop and the taskserver. I used TaskWarrior Mobile as the client. I have to say I wouldn t want to use that client as my primary task tracking interface: the setup procedure is clunky even beyond the necessity of copying a client certificate around, it expects you to give it a .taskrc rather than having a proper settings interface for that, and it only seems to let you add a task if you specify a due date for it. It also lacks Timewarrior integration, so I can only really use it when I don t care about time tracking, e.g. personal tasks. But that s really all I need, so it meets my minimum requirements. Next? Considering this is literally the first thing I tried, I have to say I m pretty happy with it. There are a bunch of optional extras I haven t tried yet, but in general it kind of has the vim nature for me: if I need something it s very likely to exist or easy enough to build, but the features I don t use don t get in my way. I wouldn t recommend any of this to somebody who didn t already spend most of their time in a terminal - but I do. I m glad people have gone to all the effort to build this so I didn t have to.

16 January 2024

Thomas Koch: Missing memegen

Posted on May 1, 2022
Back at $COMPANY we had an internal meme-site. I had some reputation in my team for creating good memes. When I watched Episode 3 of Season 2 from Yes Premier Minister yesterday, I really missed a place to post memes. This is the full scene. Please watch it or even the full episode before scrolling down to the GIFs. I had a good laugh for some time. With Debian, I could just download the episode from somewhere on the net with youtube-dl and easily create two GIFs using ffmpeg, with and without subtitle:
ffmpeg  -ss 0:5:59.600 -to 0:6:11.150 -i Downloads/Yes.Prime.Minister.S02E03-1254485068289.mp4 tmp/tragic.gif
ffmpeg  -ss 0:5:59.600 -to 0:6:11.150 -i Downloads/Yes.Prime.Minister.S02E03-1254485068289.mp4 \
        -vf "subtitles=tmp/sub.srt:force_style='Fontsize=60'" tmp/tragic_with_subtitle.gif
And this sub.srt file:
1
00:00:10,000 --> 00:00:12,000
Tragic.
I believe, one needs to install the libavfilter-extra variant to burn the subtitle in the GIF. Some space to hide the GIFs. The Premier Minister just learned, that his predecessor, who was about to publish embarassing memories, died of a sudden heart attack: I can t actually think of a meme with this GIF, that the internal thought police community moderation would not immediately take down. For a moment I thought that it would be fun to have a Meme-Site for Debian members. But it is probably not the right time for this. Maybe somebody likes the above GIFs though and wants to use them somewhere.

11 January 2024

Matthias Klumpp: Wayland really breaks things Just for now?

This post is in part a response to an aspect of Nate s post Does Wayland really break everything? , but also my reflection on discussing Wayland protocol additions, a unique pleasure that I have been involved with for the past months1.

Some facts Before I start I want to make a few things clear: The Linux desktop will be moving to Wayland2 this is a fact at this point (and has been for a while), sticking to X11 makes no sense for future projects. From reading Wayland protocols and working with it at a much lower level than I ever wanted to, it is also very clear to me that Wayland is an exceptionally well-designed core protocol, and so are the additional extension protocols (xdg-shell & Co.). The modularity of Wayland is great, it gives it incredible flexibility and will for sure turn out to be good for the long-term viability of this project (and also provides a path to correct protocol issues in future, if one is found). In other words: Wayland is an amazing foundation to build on, and a lot of its design decisions make a lot of sense! The shift towards people seeing Linux more as an application developer platform, and taking PipeWire and XDG Portals into account when designing for Wayland is also an amazing development and I love to see this this holistic approach is something I always wanted! Furthermore, I think Wayland removes a lot of functionality that shouldn t exist in a modern compositor and that s a good thing too! Some of X11 s features and design decisions had clear drawbacks that we shouldn t replicate. I highly recommend to read Nate s blog post, it s very good and goes into more detail. And due to all of this, I firmly believe that any advancement in the Wayland space must come from within the project.

But! But! Of course there was a but coming  I think while developing Wayland-as-an-ecosystem we are now entrenched into narrow concepts of how a desktop should work. While discussing Wayland protocol additions, a lot of concepts clash, people from different desktops with different design philosophies debate the merits of those over and over again never reaching any conclusion (just as you will never get an answer out of humans whether sushi or pizza is the clearly superior food, or whether CSD or SSD is better). Some people want to use Wayland as a vehicle to force applications to submit to their desktop s design philosophies, others prefer the smallest and leanest protocol possible, other developers want the most elegant behavior possible. To be clear, I think those are all very valid approaches. But this also creates problems: By switching to Wayland compositors, we are already forcing a lot of porting work onto toolkit developers and application developers. This is annoying, but just work that has to be done. It becomes frustrating though if Wayland provides toolkits with absolutely no way to reach their goal in any reasonable way. For Nate s Photoshop analogy: Of course Linux does not break Photoshop, it is Adobe s responsibility to port it. But what if Linux was missing a crucial syscall that Photoshop needed for proper functionality and Adobe couldn t port it without that? In that case it becomes much less clear on who is to blame for Photoshop not being available. A lot of Wayland protocol work is focused on the environment and design, while applications and work to port them often is considered less. I think this happens because the overlap between application developers and developers of the desktop environments is not necessarily large, and the overlap with people willing to engage with Wayland upstream is even smaller. The combination of Windows developers porting apps to Linux and having involvement with toolkits or Wayland is pretty much nonexistent. So they have less of a voice.

A quick detour through the neuroscience research lab I have been involved with Freedesktop, GNOME and KDE for an incredibly long time now (more than a decade), but my actual job (besides consulting for Purism) is that of a PhD candidate in a neuroscience research lab (working on the morphology of biological neurons and its relation to behavior). I am mostly involved with three research groups in our institute, which is about 35 people. Most of us do all our data analysis on powerful servers which we connect to using RDP (with KDE Plasma as desktop). Since I joined, I have been pushing the envelope a bit to extend Linux usage to data acquisition and regular clients, and to have our data acquisition hardware interface well with it. Linux brings some unique advantages for use in research, besides the obvious one of having every step of your data management platform introspectable with no black boxes left, a goal I value very highly in research (but this would be its own blogpost). In terms of operating system usage though, most systems are still Windows-based. Windows is what companies develop for, and what people use by default and are familiar with. The choice of operating system is very strongly driven by application availability, and WSL being really good makes this somewhat worse, as it removes the need for people to switch to a real Linux system entirely if there is the occasional software requiring it. Yet, we have a lot more Linux users than before, and use it in many places where it makes sense. I also developed a novel data acquisition software that even runs on Linux-only and uses the abilities of the platform to its fullest extent. All of this resulted in me asking existing software and hardware vendors for Linux support a lot more often. Vendor-customer relationship in science is usually pretty good, and vendors do usually want to help out. Same for open source projects, especially if you offer to do Linux porting work for them But overall, the ease of use and availability of required applications and their usability rules supreme. Most people are not technically knowledgeable and just want to get their research done in the best way possible, getting the best results with the least amount of friction.
KDE/Linux usage at a control station for a particle accelerator at Adlershof Technology Park, Germany, for reference (by 25years of KDE)3

Back to the point The point of that story is this: GNOME, KDE, RHEL, Debian or Ubuntu: They all do not matter if the necessary applications are not available for them. And as soon as they are, the easiest-to-use solution wins. There are many facets of easiest : In many cases this is RHEL due to Red Hat support contracts being available, in many other cases it is Ubuntu due to its mindshare and ease of use. KDE Plasma is also frequently seen, as it is perceived a bit easier to onboard Windows users with it (among other benefits). Ultimately, it comes down to applications and 3rd-party support though. Here s a dirty secret: In many cases, porting an application to Linux is not that difficult. The thing that companies (and FLOSS projects too!) struggle with and will calculate the merits of carefully in advance is whether it is worth the support cost as well as continuous QA/testing. Their staff will have to do all of that work, and they could spend that time on other tasks after all. So if they learn that porting to Linux not only means added testing and support, but also means to choose between the legacy X11 display server that allows for 1:1 porting from Windows or the new Wayland compositors that do not support the same features they need, they will quickly consider it not worth the effort at all. I have seen this happen. Of course many apps use a cross-platform toolkit like Qt, which greatly simplifies porting. But this just moves the issue one layer down, as now the toolkit needs to abstract Windows, macOS and Wayland. And Wayland does not contain features to do certain things or does them very differently from e.g. Windows, so toolkits have no way to actually implement the existing functionality in a way that works on all platforms. So in Qt s documentation you will often find texts like works everywhere except for on Wayland compositors or mobile 4. Many missing bits or altered behavior are just papercuts, but those add up. And if users will have a worse experience, this will translate to more support work, or people not wanting to use the software on the respective platform.

What s missing?

Window positioning SDI applications with multiple windows are very popular in the scientific world. For data acquisition (for example with microscopes) we often have one monitor with control elements and one larger one with the recorded image. There is also other configurations where multiple signal modalities are acquired, and the experimenter aligns windows exactly in the way they want and expects the layout to be stored and to be loaded upon reopening the application. Even in the image from Adlershof Technology Park above you can see this style of UI design, at mega-scale. Being able to pop-out elements as windows from a single-window application to move them around freely is another frequently used paradigm, and immensely useful with these complex apps. It is important to note that this is not a legacy design, but in many cases an intentional choice these kinds of apps work incredibly well on larger screens or many screens and are very flexible (you can have any window configuration you want, and switch between them using the (usually) great window management abilities of your desktop). Of course, these apps will work terribly on tablets and small form factors, but that is not the purpose they were designed for and nobody would use them that way. I assumed for sure these features would be implemented at some point, but when it became clear that that would not happen, I created the ext-placement protocol which had some good discussion but was ultimately rejected from the xdg namespace. I then tried another solution based on feedback, which turned out not to work for most apps, and now proposed xdg-placement (v2) in an attempt to maybe still get some protocol done that we can agree on, exploring more options before pushing the existing protocol for inclusion into the ext Wayland protocol namespace. Meanwhile though, we can not port any application that needs this feature, while at the same time we are switching desktops and distributions to Wayland by default.

Window position restoration Similarly, a protocol to save & restore window positions was already proposed in 2018, 6 years ago now, but it has still not been agreed upon, and may not even help multiwindow apps in its current form. The absence of this protocol means that applications can not restore their former window positions, and the user has to move them to their previous place again and again. Meanwhile, toolkits can not adopt these protocols and applications can not use them and can not be ported to Wayland without introducing papercuts.

Window icons Similarly, individual windows can not set their own icons, and not-installed applications can not have an icon at all because there is no desktop-entry file to load the icon from and no icon in the theme for them. You would think this is a niche issue, but for applications that create many windows, providing icons for them so the user can find them is fairly important. Of course it s not the end of the world if every window has the same icon, but it s one of those papercuts that make the software slightly less user-friendly. Even applications with fewer windows like LibrePCB are affected, so much so that they rather run their app through Xwayland for now. I decided to address this after I was working on data analysis of image data in a Python virtualenv, where my code and the Python libraries used created lots of windows all with the default yellow W icon, making it impossible to distinguish them at a glance. This is xdg-toplevel-icon now, but of course it is an uphill battle where the very premise of needing this is questioned. So applications can not use it yet.

Limited window abilities requiring specialized protocols Firefox has a picture-in-picture feature, allowing it to pop out media from a mediaplayer as separate floating window so the user can watch the media while doing other things. On X11 this is easily realized, but on Wayland the restrictions posed on windows necessitate a different solution. The xdg-pip protocol was proposed for this specialized usecase, but it is also not merged yet. So this feature does not work as well on Wayland.

Automated GUI testing / accessibility / automation Automation of GUI tasks is a powerful feature, so is the ability to auto-test GUIs. This is being worked on, with libei and wlheadless-run (and stuff like ydotool exists too), but we re not fully there yet.

Wayland is frustrating for (some) application authors As you see, there is valid applications and valid usecases that can not be ported yet to Wayland with the same feature range they enjoyed on X11, Windows or macOS. So, from an application author s perspective, Wayland does break things quite significantly, because things that worked before can no longer work and Wayland (the whole stack) does not provide any avenue to achieve the same result. Wayland does break screen sharing, global hotkeys, gaming latency (via no tearing ) etc, however for all of these there are solutions available that application authors can port to. And most developers will gladly do that work, especially since the newer APIs are usually a lot better and more robust. But if you give application authors no path forward except use Xwayland and be on emulation as second-class citizen forever , it just results in very frustrated application developers. For some application developers, switching to a Wayland compositor is like buying a canvas from the Linux shop that forces your brush to only draw triangles. But maybe for your avant-garde art, you need to draw a circle. You can approximate one with triangles, but it will never be as good as the artwork of your friends who got their canvases from the Windows or macOS art supply shop and have more freedom to create their art.

Triangles are proven to be the best shape! If you are drawing circles you are creating bad art! Wayland, via its protocol limitations, forces a certain way to build application UX often for the better, but also sometimes to the detriment of users and applications. The protocols are often fairly opinionated, a result of the lessons learned from X11. In any case though, it is the odd one out Windows and macOS do not pose the same limitations (for better or worse!), and the effort to port to Wayland is orders of magnitude bigger, or sometimes in case of the multiwindow UI paradigm impossible to achieve to the same level of polish. Desktop environments of course have a design philosophy that they want to push, and want applications to integrate as much as possible (same as macOS and Windows!). However, there are many applications out there, and pushing a design via protocol limitations will likely just result in fewer apps.

The porting dilemma I spent probably way too much time looking into how to get applications cross-platform and running on Linux, often talking to vendors (FLOSS and proprietary) as well. Wayland limitations aren t the biggest issue by far, but they do start to come come up now, especially in the scientific space with Ubuntu having switched to Wayland by default. For application authors there is often no way to address these issues. Many scientists do not even understand why their Python script that creates some GUIs suddenly behaves weirdly because Qt is now using the Wayland backend on Ubuntu instead of X11. They do not know the difference and also do not want to deal with these details even though they may be programmers as well, the real goal is not to fiddle with the display server, but to get to a scientific result somehow. Another issue is portability layers like Wine which need to run Windows applications as-is on Wayland. Apparently Wine s Wayland driver has some heuristics to make window positioning work (and I am amazed by the work done on this!), but that can only go so far.

A way out? So, how would we actually solve this? Fundamentally, this excessively long blog post boils down to just one essential question: Do we want to force applications to submit to a UX paradigm unconditionally, potentially loosing out on application ports or keeping apps on X11 eternally, or do we want to throw them some rope to get as many applications ported over to Wayland, even through we might sacrifice some protocol purity? I think we really have to answer that to make the discussions on wayland-protocols a lot less grueling. This question can be answered at the wayland-protocols level, but even more so it must be answered by the individual desktops and compositors. If the answer for your environment turns out to be Yes, we want the Wayland protocol to be more opinionated and will not make any compromises for application portability , then your desktop/compositor should just immediately NACK protocols that add something like this and you simply shouldn t engage in the discussion, as you reject the very premise of the new protocol: That it has any merit to exist and is needed in the first place. In this case contributors to Wayland and application authors also know where you stand, and a lot of debate is skipped. Of course, if application authors want to support your environment, you are basically asking them now to rewrite their UI, which they may or may not do. But at least they know what to expect and how to target your environment. If the answer turns out to be We do want some portability , the next question obviously becomes where the line should be drawn and which changes are acceptable and which aren t. We can t blindly copy all X11 behavior, some porting work to Wayland is simply inevitable. Some written rules for that might be nice, but probably more importantly, if you agree fundamentally that there is an issue to be fixed, please engage in the discussions for the respective MRs! We for sure do not want to repeat X11 mistakes, and I am certain that we can implement protocols which provide the required functionality in a way that is a nice compromise in allowing applications a path forward into the Wayland future, while also being as good as possible and improving upon X11. For example, the toplevel-icon proposal is already a lot better than anything X11 ever had. Relaxing ACK requirements for the ext namespace is also a good proposed administrative change, as it allows some compositors to add features they want to support to the shared repository easier, while also not mandating them for others. In my opinion, it would allow for a lot less friction between the two different ideas of how Wayland protocol development should work. Some compositors could move forward and support more protocol extensions, while more restrictive compositors could support less things. Applications can detect supported protocols at launch and change their behavior accordingly (ideally even abstracted by toolkits). You may now say that a lot of apps are ported, so surely this issue can not be that bad. And yes, what Wayland provides today may be enough for 80-90% of all apps. But what I hope the detour into the research lab has done is convince you that this smaller percentage of apps matters. A lot. And that it may be worthwhile to support them. To end on a positive note: When it came to porting concrete apps over to Wayland, the only real showstoppers so far5 were the missing window-positioning and window-position-restore features. I encountered them when porting my own software, and I got the issue as feedback from colleagues and fellow engineers. In second place was UI testing and automation support, the window-icon issue was mentioned twice, but being a cosmetic issue it likely simply hurts people less and they can ignore it easier. What this means is that the majority of apps are already fine, and many others are very, very close! A Wayland future for everyone is within our grasp!  I will also bring my two protocol MRs to their conclusion for sure, because as application developers we need clarity on what the platform (either all desktops or even just a few) supports and will or will not support in future. And the only way to get something good done is by contribution and friendly discussion.

Footnotes
  1. Apologies for the clickbait-y title it comes with the subject
  2. When I talk about Wayland I mean the combined set of display server protocols and accepted protocol extensions, unless otherwise clarified.
  3. I would have picked a picture from our lab, but that would have needed permission first
  4. Qt has awesome platform issues pages, like for macOS and Linux/X11 which help with porting efforts, but Qt doesn t even list Linux/Wayland as supported platform. There is some information though, like window geometry peculiarities, which aren t particularly helpful when porting (but still essential to know).
  5. Besides issues with Nvidia hardware CUDA for simulations and machine-learning is pretty much everywhere, so Nvidia cards are common, which causes trouble on Wayland still. It is improving though.

4 January 2024

Aigars Mahinovs: Figuring out finances part 5

At the end of the last part of this, we got a Home Assistant OS installation that contains in itself a Firefly III instance and that contains all the current financial information. They are communicating and calculating predictions for me. The only part that I was not 100% happy with was accounting of cash transactions. You see payments in cash are mostly made away from computer and sometimes even in areas without a mobile Internet connection. And all Firefly III apps that I tried failed at the task of creating a new transaction when offline. Even the one recommended Telegram bot from Firefly III page used a dialog-based approach for creating even a simplest transaction. Issue asking for a one-shot transaction creation option stands as unresolved. Theoretically it would have been best if I could simply contribute that feature to that particular Telegram bot ... but it's written in Javascript. By mapping the API onto tasks somehow. After about 4 hours I still could not figure out where in the code anything actually happens. It all looked like just sugar or spagetty. Connectors on connectors on mappers. So I did the real open-source thing and just wrote my own tool. firefly3_telegram_oneshot is a maximally simple Telegram bot based on python-telegram-bot library. So what does it do? The primary usage for me is to simply send a message to the bot at any time with text like "23.2 coffee and cake" and when the message eventually reaches the bot, it then should create a new transaction from my "cash" account to "Unknown" account in amount of 23.20 and description "coffee and cake". That is the key. Everything else in the bot is comfort. For example "/undo" command deletes the last transaction in cash account (presumably added by error) and "/last" shows which transaction the "/undo" would delete. And to help with expense categorisation one can also do a message like "6.1 beer, dest=Edeka, cat=alcohol" that would search for a destination account that would fuzzy match to "Edeka" (a supermarket in Germany) and add the transation to the category fuzzy matched to "alcohol", like "Shopping - alcoholic drinks". And to make that fuzzy matching more reliable I also added "/cat something" and "/dest something" commands that would show which category or destination account would be matched with a given string. All of that in around 250 lines of Python code and executed by a 17 line Dockerfile (via the Portainer on my Home Assistant OS). One remaining function that could be nice is creating a category or destination account on request (for example when the first character of the supplied string is "+"). I am really plesantly surprised about how much can be done with how little code using the above Python library. And you never need to have any open incoming ports anywhere to runs such bots, so the attack surface for such bot-based service is much tighter. All in all the system works and works well. The only exception is that for my particual bank there is still no automatic way of extracting data about credit card transactions. For those I still have to manually log into the Internet bank, export a CSV file and then feed that into the Firefly III importer. Annoying. And I am not really motivated to try to hack my bank :D Has this been useful to any of you? Any ideas to expand or improve what I have? Just find me as "aigarius" on any social media and speak up :)

Michael Ablassmeier: Migrating a system to Hetzner cloud using REAR and kexec

I needed to migrate an existing system to an Hetzner cloud VPS. While it is possible to attach KVM consoles and custom ISO images to dedicated servers, i didn t find any way to do so with regular cloud instances. For system migrations i usually use REAR, which has never failed me. (and also has saved my ass during recovery multiple times). It s an awesome utility! It s possible to do this using the Hetzner recovery console too, but using REAR is very convenient here, because it handles things like re-creating the partition layout and network settings automatically! The steps are:

Example To create a rescue image on the source system:
apt install rear
echo OUTPUT=ISO > /etc/rear/local.conf
rear mkrescue -v
[..]
Wrote ISO image: /var/lib/rear/output/rear-debian12.iso (185M)
My source system had a 128 GB disk, so i registered an instance on Hetzner cloud with greater disk size to make things easier: image Now copy the ISO image to the newly created instance and extract its data:
 apt install kexec-tools
 scp rear-debian12.iso root@49.13.193.226:/tmp/
 modprobe loop
 mount -o loop rear-debian12.iso /mnt/
 cp /mnt/isolinux/kernel /tmp/
 cp /mnt/isolinux/initrd.cgz /tmp/
Install kexec if not installed already:
 apt install kexec-tools
Note down the current gateway configuration, this is required later on to make the REAR recovery console reachable via SSH:
root@testme:~# ip route
default via 172.31.1.1 dev eth0
172.31.1.1 dev eth0 scope link
Reboot the running VPS instance into the REAR recovery image using somewhat the same kernel cmdline:
root@testme:~# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-6.1.0-13-amd64 root=UUID=5174a81e-5897-47ca-8fe4-9cd19dc678c4 ro consoleblank=0 systemd.show_status=true console=tty1 console=ttyS0
kexec --initrd /tmp/initrd.cgz --command-line="consoleblank=0 systemd.show_status=true console=tty1 console=ttyS0" /tmp/kernel
Connection to 49.13.193.226 closed by remote host.
Connection to 49.13.193.226 closed
Now watch the system on the Console booting into the REAR system: image Login the recovery console (root without password) and fix its default route to make it reachable:
ip addr
[..]
2: enp1s0
..
$ ip route add 172.31.1.1 dev enp1s0
$ ip route add default via 172.31.1.1
ping 49.13.193.226
64 bytes from 49.13.193.226: icmp_seq=83 ttl=52 time=27.7 ms
The network configuration might differ, the source system in this example used DHCP, as the target does. If REAR detects changed static network configuration it guides you through the setup pretty nicely. Login via SSH (REAR will store your ssh public keys in the image) and start the recovery process, follow the steps as suggested by REAR:
ssh -l root 49.13.193.226
Welcome to Relax-and-Recover. Run "rear recover" to restore your system !
RESCUE debian12:~ # rear recover
Relax-and-Recover 2.7 / Git
Running rear recover (PID 673 date 2024-01-04 19:20:22)
Using log file: /var/log/rear/rear-debian12.log
Running workflow recover within the ReaR rescue/recovery system
Will do driver migration (recreating initramfs/initrd)
Comparing disks
Device vda does not exist (manual configuration needed)
Switching to manual disk layout configuration (GiB sizes rounded down to integer)
/dev/vda had size 137438953472 (128 GiB) but it does no longer exist
/dev/sda was not used on the original system and has now 163842097152 (152 GiB)
Original disk /dev/vda does not exist (with same size) in the target system
Using /dev/sda (the only available of the disks) for recreating /dev/vda
Current disk mapping table (source => target):
  /dev/vda => /dev/sda
Confirm or edit the disk mapping
1) Confirm disk mapping and continue 'rear recover'
[..]
User confirmed recreated disk layout
[..]
This step re-recreates your original disk layout and mounts it to /mnt/local/ (this example uses a pretty lame layout, but usually REAR will handle things like lvm/btrfs just nicely):
mount
/dev/sda3 on /mnt/local type ext4 (rw,relatime,errors=remount-ro)
/dev/sda1 on /mnt/local/boot type ext4 (rw,relatime)
Now clone your source systems data to /mnt/local/ with whatever utility you like to use and exit the recovery step. After confirming everything went well, REAR will setup the bootloader (and all other config details like fstab entries and adjusted network configuration) for you as required:
rear> exit
Did you restore the backup to /mnt/local ? Are you ready to continue recovery ? yes
User confirmed restored files
Updated initramfs with new drivers for this system.
Skip installing GRUB Legacy boot loader because GRUB 2 is installed (grub-probe or grub2-probe exist).
Installing GRUB2 boot loader...
Determining where to install GRUB2 (no GRUB2_INSTALL_DEVICES specified)
Found possible boot disk /dev/sda - installing GRUB2 there
Finished 'recover'. The target system is mounted at '/mnt/local'.
Exiting rear recover (PID 7103) and its descendant processes ...
Running exit tasks
Now reboot the recovery console and watch it boot into your target systems configuration: image Being able to use this procedure for complete disaster recovery within Hetzner cloud VPS (using off-site backups) gives me a better feeling, too.

3 January 2024

John Goerzen: Live Migrating from Raspberry Pi OS bullseye to Debian bookworm

I ve been getting annoyed with Raspberry Pi OS (Raspbian) for years now. It s a fork of Debian, but manages to omit some of the most useful things. So I ve decided to migrate all of my Pis to run pure Debian. These are my reasons:
  1. Raspberry Pi OS has, for years now, specified that there is no upgrade path. That is, to get to a newer major release, it s a reinstall. While I have sometimes worked around this, for a device that is frequently installed in hard-to-reach locations, this is even more important than usual. It s common for me to upgrade machines for a decade or more across Debian releases and there s no reason that it should be so much more difficult with Raspbian.
  2. As I noted in Consider Security First, the security situation for Raspberry Pi OS isn t as good as it is with Debian.
  3. Raspbian lags behind Debian often times by 6 months or more for major releases, and days or weeks for bug fixes and security patches.
  4. Raspbian has no direct backports support, though Raspberry Pi 3 and above can use Debian s backports (per my instructions as Installing Debian Backports on Raspberry Pi)
  5. Raspbian uses a custom kernel without initramfs support
It turns out it is actually possible to do an in-place migration from Raspberry Pi OS bullseye to Debian bookworm. Here I will describe how. Even if you don t have a Raspberry Pi, this might still be instructive on how Raspbian and Debian packages work.

WARNINGS Before continuing, back up your system. This process isn t for the neophyte and it is entirely possible to mess up your boot device to the point that you have to do a fresh install to get your Pi to boot. This isn t a supported process at all.

Architecture Confusion Debian has three ARM-based architectures:
  • armel, for the lowest-end 32-bit ARM devices without hardware floating point support
  • armhf, for the higher-end 32-bit ARM devices with hardware float (hence hf )
  • arm64, for 64-bit ARM devices (which all have hardware float)
Although the Raspberry Pi 0 and 1 do support hardware float, they lack support for other CPU features that Debian s armhf architecture assumes. Therefore, the Raspberry Pi 0 and 1 could only run Debian s armel architecture. Raspberry Pi 3 and above are capable of running 64-bit, and can run both armhf and arm64. Prior to the release of the Raspberry Pi 5 / Raspbian bookworm, Raspbian only shipped the armhf architecture. Well, it was an architecture they called armhf, but it was different from Debian s armhf in that everything was recompiled to work with the more limited set of features on the earlier Raspberry Pi boards. It was really somewhere between Debian s armel and armhf archs. You could run Debian armel on those, but it would run more slowly, due to doing floating point calculations without hardware support. Debian s raspi FAQ goes into this a bit. What I am going to describe here is going from Raspbian armhf to Debian armhf with a 64-bit kernel. Therefore, it will only work with Raspberry Pi 3 and above. It may theoretically be possible to take a Raspberry Pi 2 to Debian armhf with a 32-bit kernel, but I haven t tried this and it may be more difficult. I have seen conflicting information on whether armhf really works on a Pi 2. (If you do try it on a Pi 2, ignore everything about arm64 and 64-bit kernels below, and just go with the linux-image-armmp-lpae kernel per the ARMMP page) There is another wrinkle: Debian doesn t support running 32-bit ARM kernels on 64-bit ARM CPUs, though it does support running a 32-bit userland on them. So we will wind up with a system with kernel packages from arm64 and everything else from armhf. This is a perfectly valid configuration as the arm64 like x86_64 is multiarch (that is, the CPU can natively execute both the 32-bit and 64-bit instructions). (It is theoretically possible to crossgrade a system from 32-bit to 64-bit userland, but that felt like a rather heavy lift for dubious benefit on a Pi; nevertheless, if you want to make this process even more complicated, refer to the CrossGrading page.)

Prerequisites and Limitations In addition to the need for a Raspberry Pi 3 or above in order for this to work, there are a few other things to mention. If you are using the GPIO features of the Pi, I don t know if those work with Debian. I think Raspberry Pi OS modified the desktop environment more than other components. All of my Pis are headless, so I don t know if this process will work if you use a desktop environment. I am assuming you are booting from a MicroSD card as is typical in the Raspberry Pi world. The Pi s firmware looks for a FAT partition (MBR type 0x0c) and looks within it for boot information. Depending on how long ago you first installed an OS on your Pi, your /boot may be too small for Debian. Use df -h /boot to see how big it is. I recommend 200MB at minimum. If your /boot is smaller than that, stop now (or use some other system to shrink your root filesystem and rearrange your partitions; I ve done this, but it s outside the scope of this article.) You need to have stable power. Once you begin this process, your pi will mostly be left in a non-bootable state until you finish. (You did make a backup, right?)

Basic idea The basic idea here is that since bookworm has almost entirely newer packages then bullseye, we can just switch over to it and let the Debian packages replace the Raspbian ones as they are upgraded. Well, it s not quite that easy, but that s the main idea.

Preparation First, make a backup. Even an image of your MicroSD card might be nice. OK, I think I ve said that enough now. It would be a good idea to have a HDMI cable (with the appropriate size of connector for your particular Pi board) and a HDMI display handy so you can troubleshoot any bootup issues with a console.

Preparation: access The Raspberry Pi OS by default sets up a user named pi that can use sudo to gain root without a password. I think this is an insecure practice, but assuming you haven t changed it, you will need to ensure it still works once you move to Debian. Raspberry Pi OS had a patch in their sudo package to enable it, and that will be removed when Debian s sudo package is installed. So, put this in /etc/sudoers.d/010_picompat:
pi ALL=(ALL) NOPASSWD: ALL
Also, there may be no password set for the root account. It would be a good idea to set one; it makes it easier to log in at the console. Use the passwd command as root to do so.

Preparation: bluetooth Debian doesn t correctly identify the Bluetooth hardware address. You can save it off to a file by running hcitool dev > /root/bluetooth-from-raspbian.txt. I don t use Bluetooth, but this should let you develop a script to bring it up properly.

Preparation: Debian archive keyring You will next need to install Debian s archive keyring so that apt can authenticate packages from Debian. Go to the bookworm download page for debian-archive-keyring and copy the URL for one of the files, then download it on the pi. For instance:
wget http://http.us.debian.org/debian/pool/main/d/debian-archive-keyring/debian-archive-keyring_2023.3+deb12u1_all.deb
Use sha256sum to verify the checksum of the downloaded file, comparing it to the package page on the Debian site. Now, you ll install it with:
dpkg -i debian-archive-keyring_2023.3+deb12u1_all.deb

Package first steps From here on, we are making modifications to the system that can leave it in a non-bootable state. Examine /etc/apt/sources.list and all the files in /etc/apt/sources.list.d. Most likely you will want to delete or comment out all lines in all files there. Replace them with something like:
deb http://deb.debian.org/debian/ bookworm main non-free-firmware contrib non-free
deb http://security.debian.org/debian-security bookworm-security main non-free-firmware contrib non-free
deb https://deb.debian.org/debian bookworm-backports main non-free-firmware contrib non-free
(you might leave off contrib and non-free depending on your needs) Now, we re going to tell it that we ll support arm64 packages:
dpkg --add-architecture arm64
And finally, download the bookworm package lists:
apt-get update
If there are any errors from that command, fix them and don t proceed until you have a clean run of apt-get update.

Moving /boot to /boot/firmware The boot FAT partition I mentioned above is mounted at /boot by Raspberry Pi OS, but Debian s scripts assume it will be at /boot/firmware. We need to fix this. First:
umount /boot
mkdir /boot/firmware
Now, edit fstab and change the reference to /boot to be to /boot/firmware. Now:
mount -v /boot/firmware
cd /boot/firmware
mv -vi * ..
This mounts the filesystem at the new location, and moves all its contents back to where apt believes it should be. Debian s packages will populate /boot/firmware later.

Installing the first packages Now we start by installing the first of the needed packages. Eventually we will wind up with roughly the same set Debian uses.
apt-get install linux-image-arm64
apt-get install firmware-brcm80211=20230210-5
apt-get install raspi-firmware
If you get errors relating to firmware-brcm80211 from any commands, run that install firmware-brcm80211 command and then proceed. There are a few packages that Raspbian marked as newer than the version in bookworm (whether or not they really are), and that s one of them.

Configuring the bootloader We need to configure a few things in /etc/default/raspi-firmware before proceeding. Edit that file. First, uncomment (or add) a line like this:
KERNEL_ARCH="arm64"
Next, in /boot/cmdline.txt you can find your old Raspbian boot command line. It will say something like:
root=PARTUUID=...
Save off the bit starting with PARTUUID. Back in /etc/default/raspi-firmware, set a line like this:
ROOTPART=PARTUUID=abcdef00
(substituting your real value for abcdef00). This is necessary because the microSD card device name often changes from /dev/mmcblk0 to /dev/mmcblk1 when switching to Debian s kernel. raspi-firmware will encode the current device name in /boot/firmware/cmdline.txt by default, which will be wrong once you boot into Debian s kernel. The PARTUUID approach lets it work regardless of the device name.

Purging the Raspbian kernel Run:
dpkg --purge raspberrypi-kernel

Upgrading the system At this point, we are going to run the procedure beginning at section 4.4.3 of the Debian release notes. Generally, you will do:
apt-get -u upgrade
apt full-upgrade
Fix any errors at each step before proceeding to the next. Now, to remove some cruft, run:
apt-get --purge autoremove
Inspect the list to make sure nothing important isn t going to be removed.

Removing Raspbian cruft You can list some of the cruft with:
apt list '~o'
And remove it with:
apt purge '~o'
I also don t run Bluetooth, and it seemed to sometimes hang on boot becuase I didn t bother to fix it, so I did:
apt-get --purge remove bluez

Installing some packages This makes sure some basic Debian infrastructure is available:
apt-get install wpasupplicant parted dosfstools wireless-tools iw alsa-tools
apt-get --purge autoremove

Installing firmware Now run:
apt-get install firmware-linux

Resolving firmware package version issues If it gives an error about the installed version of a package, you may need to force it to the bookworm version. For me, this often happened with firmware-atheros, firmware-libertas, and firmware-realtek. Here s how to resolve it, with firmware-realtek as an example:
  1. Go to https://packages.debian.org/PACKAGENAME for instance, https://packages.debian.org/firmware-realtek. Note the version number in bookworm in this case, 20230210-5.
  2. Now, you will force the installation of that package at that version:
    apt-get install firmware-realtek=20230210-5
    
  3. Repeat with every conflicting package until done.
  4. Rerun apt-get install firmware-linux and make sure it runs cleanly.
Also, in the end you should be able to:
apt-get install firmware-atheros firmware-libertas firmware-realtek firmware-linux

Dealing with other Raspbian packages The Debian release notes discuss removing non-Debian packages. There will still be a few of those. Run:
apt list '?narrow(?installed, ?not(?origin(Debian)))'
Deal with them; mostly you will need to force the installation of a bookworm version using the procedure in the section Resolving firmware package version issues above (even if it s not for a firmware package). For non-firmware packages, you might possibly want to add --mark-auto to your apt-get install command line to allow the package to be autoremoved later if the things depending on it go away. If you aren t going to use Bluetooth, I recommend apt-get --purge remove bluez as well. Sometimes it can hang at boot if you don t fix it up as described above.

Set up networking We ll be switching to the Debian method of networking, so we ll create some files in /etc/network/interfaces.d. First, eth0 should look like this:
allow-hotplug eth0
iface eth0 inet dhcp
iface eth0 inet6 auto
And wlan0 should look like this:
allow-hotplug wlan0
iface wlan0 inet dhcp
    wpa-conf /etc/wpa_supplicant/wpa_supplicant.conf
Raspbian is inconsistent about using eth0/wlan0 or renamed interface. Run ifconfig or ip addr. If you see a long-named interface such as enx<something> or wlp<something>, copy the eth0 file to the one named after the enx interface, or the wlan0 file to the one named after the wlp interface, and edit the internal references to eth0/wlan0 in this new file to name the long interface name. If using wifi, verify that your SSIDs and passwords are in /etc/wpa_supplicant/wpa_supplicant.conf. It should have lines like:
network= 
   ssid="NetworkName"
   psk="passwordHere"
 
(This is where Raspberry Pi OS put them).

Deal with DHCP Raspberry Pi OS used dhcpcd, whereas bookworm normally uses isc-dhcp-client. Verify the system is in the correct state:
apt-get install isc-dhcp-client
apt-get --purge remove dhcpcd dhcpcd-base dhcpcd5 dhcpcd-dbus

Set up LEDs To set up the LEDs to trigger on MicroSD activity as they did with Raspbian, follow the Debian instructions. Run apt-get install sysfsutils. Then put this in a file at /etc/sysfs.d/local-raspi-leds.conf:
class/leds/ACT/brightness = 1
class/leds/ACT/trigger = mmc1

Prepare for boot To make sure all the /boot/firmware files are updated, run update-initramfs -u. Verify that root in /boot/firmware/cmdline.txt references the PARTUUID as appropriate. Verify that /boot/firmware/config.txt contains the lines arm_64bit=1 and upstream_kernel=1. If not, go back to the section on modifying /etc/default/raspi-firmware and fix it up.

The moment arrives Cross your fingers and try rebooting into your Debian system:
reboot
For some reason, I found that the first boot into Debian seems to hang for 30-60 seconds during bootstrap. I m not sure why; don t panic if that happens. It may be necessary to power cycle the Pi for this boot.

Troubleshooting If things don t work out, hook up the Pi to a HDMI display and see what s up. If I anticipated a particular problem, I would have documented it here (a lot of the things I documented here are because I ran into them!) So I can t give specific advice other than to watch boot messages on the console. If you don t even get kernel messages going, then there is some problem with your partition table or /boot/firmware FAT partition. Otherwise, you ve at least got the kernel going and can troubleshoot like usual from there.

2 January 2024

Matthew Garrett: Dealing with weird ELF libraries

Libraries are collections of code that are intended to be usable by multiple consumers (if you're interested in the etymology, watch this video). In the old days we had what we now refer to as "static" libraries, collections of code that existed on disk but which would be copied into newly compiled binaries. We've moved beyond that, thankfully, and now make use of what we call "dynamic" or "shared" libraries - instead of the code being copied into the binary, a reference to the library function is incorporated, and at runtime the code is mapped from the on-disk copy of the shared object[1]. This allows libraries to be upgraded without needing to modify the binaries using them, and if multiple applications are using the same library at once it only requires that one copy of the code be kept in RAM.

But for this to work, two things are necessary: when we build a binary, there has to be a way to reference the relevant library functions in the binary; and when we run a binary, the library code needs to be mapped into the process.

(I'm going to somewhat simplify the explanations from here on - things like symbol versioning make this a bit more complicated but aren't strictly relevant to what I was working on here)

For the first of these, the goal is to replace a call to a function (eg, printf()) with a reference to the actual implementation. This is the job of the linker rather than the compiler (eg, if you use the -c argument to tell gcc to simply compile to an object rather than linking an executable, it's not going to care about whether or not every function called in your code actually exists or not - that'll be figured out when you link all the objects together), and the linker needs to know which symbols (which aren't just functions - libraries can export variables or structures and so on) are available in which libraries. You give the linker a list of libraries, it extracts the symbols available, and resolves the references in your code with references to the library.

But how is that information extracted? Each ELF object has a fixed-size header that contains references to various things, including a reference to a list of "section headers". Each section has a name and a type, but the ones we're interested in are .dynstr and .dynsym. .dynstr contains a list of strings, representing the name of each exported symbol. .dynsym is where things get more interesting - it's a list of structs that contain information about each symbol. This includes a bunch of fairly complicated stuff that you need to care about if you're actually writing a linker, but the relevant entries for this discussion are an index into .dynstr (which means the .dynsym entry isn't sufficient to know the name of a symbol, you need to extract that from .dynstr), along with the location of that symbol within the library. The linker can parse this information and obtain a list of symbol names and addresses, and can now replace the call to printf() with a reference to libc instead.

(Note that it's not possible to simply encode this as "Call this address in this library" - if the library is rebuilt or is a different version, the function could move to a different location)

Experimentally, .dynstr and .dynsym appear to be sufficient for linking a dynamic library at build time - there are other sections related to dynamic linking, but you can link against a library that's missing them. Runtime is where things get more complicated.

When you run a binary that makes use of dynamic libraries, the code from those libraries needs to be mapped into the resulting process. This is the job of the runtime dynamic linker, or RTLD[2]. The RTLD needs to open every library the process requires, map the relevant code into the process's address space, and then rewrite the references in the binary into calls to the library code. This requires more information than is present in .dynstr and .dynsym - at the very least, it needs to know the list of required libraries.

There's a separate section called .dynamic that contains another list of structures, and it's the data here that's used for this purpose. For example, .dynamic contains a bunch of entries of type DT_NEEDED - this is the list of libraries that an executable requires. There's also a bunch of other stuff that's required to actually make all of this work, but the only thing I'm going to touch on is DT_HASH. Doing all this re-linking at runtime involves resolving the locations of a large number of symbols, and if the only way you can do that is by reading a list from .dynsym and then looking up every name in .dynstr that's going to take some time. The DT_HASH entry points to a hash table - the RTLD hashes the symbol name it's trying to resolve, looks it up in that hash table, and gets the symbol entry directly (it still needs to resolve that against .dynstr to make sure it hasn't hit a hash collision - if it has it needs to look up the next hash entry, but this is still generally faster than walking the entire .dynsym list to find the relevant symbol). There's also DT_GNU_HASH which fulfills the same purpose as DT_HASH but uses a more complicated algorithm that performs even better. .dynamic also contains entries pointing at .dynstr and .dynsym, which seems redundant but will become relevant shortly.

So, .dynsym and .dynstr are required at build time, and both are required along with .dynamic at runtime. This seems simple enough, but obviously there's a twist and I'm sorry it's taken so long to get to this point.

I bought a Synology NAS for home backup purposes (my previous solution was a single external USB drive plugged into a small server, which had uncomfortable single point of failure properties). Obviously I decided to poke around at it, and I found something odd - all the libraries Synology ships were entirely lacking any ELF section headers. This meant no .dynstr, .dynsym or .dynamic sections, so how was any of this working? nm asserted that the libraries exported no symbols, and readelf agreed. If I wrote a small app that called a function in one of the libraries and built it, gcc complained that the function was undefined. But executables on the device were clearly resolving the symbols at runtime, and if I loaded them into ghidra the exported functions were visible. If I dlopen()ed them, dlsym() couldn't resolve the symbols - but if I hardcoded the offset into my code, I could call them directly.

Things finally made sense when I discovered that if I passed the --use-dynamic argument to readelf, I did get a list of exported symbols. It turns out that ELF is weirder than I realised. As well as the aforementioned section headers, ELF objects also include a set of program headers. One of the program header types is PT_DYNAMIC. This typically points to the same data that's present in the .dynamic section. Remember when I mentioned that .dynamic contained references to .dynsym and .dynstr? This means that simply pointing at .dynamic is sufficient, there's no need to have separate entries for them.

The same information can be reached from two different locations. The information in the section headers is used at build time, and the information in the program headers at run time[3]. I do not have an explanation for this. But if the information is present in two places, it seems obvious that it should be able to reconstruct the missing section headers in my weird libraries? So that's what this does. It extracts information from the DYNAMIC entry in the program headers and creates equivalent section headers.

There's one thing that makes this more difficult than it might seem. The section header for .dynsym has to contain the number of symbols present in the section. And that information doesn't directly exist in DYNAMIC - to figure out how many symbols exist, you're expected to walk the hash tables and keep track of the largest number you've seen. Since every symbol has to be referenced in the hash table, once you've hit every entry the largest number is the number of exported symbols. This seemed annoying to implement, so instead I cheated, added code to simply pass in the number of symbols on the command line, and then just parsed the output of readelf against the original binaries to extract that information and pass it to my tool.

Somehow, this worked. I now have a bunch of library files that I can link into my own binaries to make it easier to figure out how various things on the Synology work. Now, could someone explain (a) why this information is present in two locations, and (b) why the build-time linker and run-time linker disagree on the canonical source of truth?

[1] "Shared object" is the source of the .so filename extension used in various Unix-style operating systems
[2] You'll note that "RTLD" is not an acryonym for "runtime dynamic linker", because reasons
[3] For environments using the GNU RTLD, at least - I have no idea whether this is the case in all ELF environments

comment count unavailable comments

Thomas Koch: Waiting for a STATE folder in the XDG basedir spec

Posted on February 18, 2014
The XDG Basedirectory specification proposes default homedir folders for the categories DATA (~/.local/share), CONFIG (~/.config) and CACHE (~/.cache). One category however is missing: STATE. This category has been requested several times but nothing happened. Examples for state data are: The missing STATE category is especially annoying if you re managing your dotfiles with a VCS (e.g. via VCSH) and you care to keep your homedir tidy. If you re as annoyed as me about the missing STATE category, please voice your opinion on the XDG mailing list. Of course it s a very long way until applications really use such a STATE directory. But without a common standard it will never happen.

1 January 2024

Tim Retout: Prevent DOM-XSS with Trusted Types - a smarter DevSecOps approach

It can be incredibly easy for a frontend developer to accidentally write a client-side cross-site-scripting (DOM-XSS) security issue, and yet these are hard for security teams to detect. Vulnerability scanners are slow, and suffer from false positives. Can smarter collaboration between development, operations and security teams provide a way to eliminate these problems altogether? Google claims that Trusted Types has all but eliminated DOM-XSS exploits on those of their sites which have implemented it. Let s find out how this can work!

DOM-XSS vulnerabilities are easy to write, but hard for security teams to catch It is very easy to accidentally introduce a client-side XSS problem. As an example of what not to do, suppose you are setting an element s text to the current URL, on the client side:
// Don't do this
para.innerHTML = location.href;
Unfortunately, an attacker can now manipulate the URL (and e.g. send this link in a phishing email), and any HTML tags they add will be interpreted by the user s browser. This could potentially be used by the attacker to send private data to a different server. Detecting DOM-XSS using vulnerability scanning tools is challenging - typically this requires crawling each page of the website and attempting to detect problems such as the one above, but there is a significant risk of false positives, especially as the complexity of the logic increases. There are already ways to avoid these exploits developers should validate untrusted input before making use of it. There are libraries such as DOMPurify which can help with sanitization.1 However, if you are part of a security team with responsibility for preventing these issues, it can be complex to understand whether you are at risk. Different developer teams may be using different techniques and tools. It may be impossible for you to work closely with every developer so how can you know that the frontend team have used these libraries correctly?

Trusted Types closes the DevSecOps feedback loop for DOM-XSS, by allowing Ops and Security to verify good Developer practices Trusted Types enforces sanitization in the browser2, by requiring the web developer to assign a particular kind of JavaScript object rather than a native string to .innerHTML and other dangerous properties. Provided these special types are created in an appropriate way, then they can be trusted not to expose XSS problems. This approach will work with whichever tools the frontend developers have chosen to use, and detection of issues can be rolled out by infrastructure engineers without requiring frontend code changes.

Content Security Policy allows enforcement of security policies in the browser itself Because enforcing this safer approach in the browser for all websites would break backwards-compatibility, each website must opt-in through Content Security Policy headers. Content Security Policy (CSP) is a mechanism that allows web pages to restrict what actions a browser should execute on their page, and a way for the site to receive reports if the policy is violated. Diagram showing a browser communicating with a web server. Content-Security-Policy headers are returned by the URL &ldquo;/&rdquo;, and the browser reports any security violations to &ldquo;/csp&rdquo;. Figure 1: Content-Security-Policy browser communication This is revolutionary, because it allows servers to receive feedback in real time on errors that may be appearing in the browser s console.

Trusted Types can be rolled out incrementally, with continuous feedback Web.dev s article on Trusted Types explains how to safely roll out the feature using the features of CSP itself:
  • Deploy a CSP collector if you haven t already
  • Switch on CSP reports without enforcement (via Content-Security-Policy-Report-Only headers)
  • Iteratively review and fix the violations
  • Switch to enforcing mode when there are a low enough rate of reports
Static analysis in a continuous integration pipeline is also sensible you want to prevent regressions shipping in new releases before they trigger a flood of CSP reports. This will also give you a chance of finding any low-traffic vulnerable pages.

Smart security teams will use techniques like Trusted Types to eliminate entire classes of bugs at a time Rather than playing whack-a-mole with unreliable vulnerability scanning or bug bounties, techniques such as Trusted Types are truly in the spirit of Secure by Design build high quality in from the start of the engineering process, and do this in a way which closes the DevSecOps feedback loop between your Developer, Operations and Security teams.

  1. Sanitization libraries are especially needed when the examples become more complex, e.g. if the application must manipulate the input. DOMPurify version 1.0.9 also added Trusted Types support, so can still be used to help developers adopt this feature.
  2. Trusted Types has existed in Chrome and Edge since 2020, and should soon be coming to Firefox as well. However, it s not necessary to wait for Firefox or Safari to add support, because the large market share of Chrome and Edge will let you identify and fix your site s DOM-XSS issues, even if you do not set enforcing mode, and users of all browsers will benefit. Even so, it is great that Mozilla is now on board.

30 December 2023

Riku Voipio: Adguard DNS, or how to reduce ads without apps/extensions

Looking at the options for blocking ads, people usually first look at browser extensions. Google's plan is to disable adblock extensions in 2024. The alternative is usually an app (on phones) or a "VPN" that does filtering for you. All these methods are quite heavyweight, and require installing software on your phone or PC. What is less known, is that you can you DNS-over-TLS or DNS-over-HTTPS for ad blocking.
What is DNS-over-TLS and DNS-over-HTTPS
Since Android 9, Google has provided a setting calledPrivate DNS. Traditional DNS is unencrypted UDP so anyone can monitor your requests and/or return false records. With private DNS, DNS-over-TLS or DNS-over-HTTPS is used to guarantee the DNS request is sent to the server you configured. Which Google hopes is of course Google's own public servers. If you do so, your ISP and hotspot providers no longer can monitor, monetize and enshittify your DNS requests - only Google can do so.
Subverting private DNS for ad blocking
This is where AdGuard DNS comes useful. By setting the AdGuard DNS server as your "private DNS" server following the instructions,you can start blocking right away. Note, on PC you can also configure the Adguard DNS server on the Browser settings (Firefox -> Enable secure DNS and Chrome -> Use Secure DNS) instead of configuring a system-wide DNS server. Blocking via DNS, of course, limits effectiveness to ads distributed from 3rd party servers.
Other uses for AdGuard DNS
If you register for Adguard DNS, you get your "own", customizable DNS server address to point to. You can, for example, create your own /etc/hosts style records that are now available to all you devices you have connected to the Adguard DNS server - whether your a are home or not. Of course, you choose to use the personal DNS server, your DNS query privacy is in the hands of AdGuard.
Going further
What else is ruining the web than Ads? Well commercial social media. An article ("Ei n in! Algoritmi hky") from the latest Finnish Magazine SKROLLI (mainos: jos luet suomeksi, Tilaa skrolli!) hit a chord for me. The algorithms of social media sites are designed not to serve you, but to addict you. For example, If you stop to watch a hateful meme image, the algorithm will record "The user spent time watching this, show more of the same!". It doesn't help block or mute - yeah that spefic hate engager will be blocked, but all the dozens similar hate pages will still be shown to you. Worse, the social media sites are being overrun by AI-generated crap. Unfortunately the addictive nature of the algorithms works. You reload in vain, hoping this time the algorithmic god will show something your friends share. How do you cure addiction? By blocking yourself out:
Epilogue
I didn't block myself out of Fediverse - yet. It's not engineered to be addictive, which is also probably why it isn't as popular as the commercial alternatives...

29 December 2023

Ulrike Uhlig: How do kids conceive the internet? - part 4

Read all parts of the series Part 1 // Part 2 // Part 3 // Part 4 I ve been wanting to write this post for over a year, but lacked energy and time. Before 2023 is coming to an end, I want to close this series and share some more insights with you and hopefully provide you with a smile here and there. For this round of interviews, four more kids around the ages of 8 to 13 were interviewed, 3 of them have a US background these 3 interviews were done by a friend who recorded these interviews for me, thank you! As opposed to the previous interviews, these four kids have parents who have a more technical professional background. And this seems to make a difference: even though none of these kids actually knew much better how the internet really works than the other kids that I interviewed, specifically in terms of physical infrastructures, they were much more confident in using the internet, they were able to more correctly name things they see on the internet, and they had partly radical ideas about what they would like to learn or what they would want to change about the internet! Looking at these results, I think it s safe to say that social reproduction is at work and that we need to improve education for kids who do not profit from this type of social and cultural wealth at home. But let s dive into the details.

The boy and the aliens (I ll be mostly transribing the interview, which was short, and which I find difficult to sum up because some of the questions are written in a way to encourage the kids to tell a story, and this particular kid had a thing going on with aliens.) He s a 13 year old boy living in the US. He has his own computer, which technically belongs to his school but can be used by him freely and he can also take it home. He s the first kid saying he s reading the news on the internet; he does not actually use social media, besides sometimes watching TikTok. When asked: Imagine that aliens land and come to you and say: We ve heard about this internet thing you all talk about, what is it? What do you tell them? he replied:
Well, I mean they re aliens, so I don t know if I wanna tell them much.
(Parents laughing in the background.) Let s assume they re friendly aliens.
Well, I would say you can look anything up and play different games. And there are alien games. But mostly the enemies are aliens which you might be a little offended by. And you can get work done, if you needed to spy on humans. There s cameras, you can film yourself, yeah. And you can text people and call people who are far away
And what would be in a drawing that would explain the internet? Google, an alien using Twitch, Google search results, and the interface of an IM software on an iPhone drawn by a 13 year old boy And here s what he explains about his drawing:
First, I would draw what I see when you open a new tab, Google.
On the right side of the drawing we see something like Twitch.
I don t wanna offend the aliens, but you can film yourself playing a game, so here is the alien and he s playing a game.
And then you can ask questions like: How did aliens come to the Earth? And the answer will be here (below). And there ll be different websites that you can click on.
And you can also look up Who won the alien contest? And that would be Usmushgagu, and that guy won the alien contest.
Do you think the information about alien intergalactic football is already on the internet?
Yeah! That s how fast the internet is.
On the bottom of the drawing we see an iPhone and an instant messaging software.
There s also a device called an iPhone and with it you can text your friends. So here s the alien asking: How was ur day? and the friend might answer IDK [I don t know].
Imagine that a wise and friendly dragon could teach you one thing about the internet that you ve always wanted to know. What would you ask the dragon to teach you about?
Is there a way you don t have to pay for any channels or subscriptions and you can get through any firewall?
Imagine you could make the internet better for everyone. What would you do first?
Well you wouldn t have to pay for it [paywalls].
Can you describe what happens between your device and a website when you visit a website?
Well, it takes 0.025 seconds. [ ] It s connecting.
Wow, that s indeed fast! We were not able to obtain more details about what is that fast thing that s happening exactly

The software engineer s kid This kid identifies as neither boy nor girl, is 10 years old and lives in Germany. Their father works as a software engineer, or in the words of the child:
My dad knows everything.
The kid has a laptop and a mobile phone, both with parental control they don t think that the controlling is fair. This kid uses the internet foremostly for listening to music and watching prank channels on Youtube but also to work with Purple Mash (a teaching platform for the computing curriculum used at their school), finding 3d printing models (that they ask their father to print with them because they did not manage to use the printer by themselves yet). Interestingly, and very differently from the non-tech-parent kids, this kid insists on using Firefox and Signal - the latter is not only used by their dad to tell them to come downstairs for dinner, but also to call their grandmother. This kid also shops online, with the help of the father who does the actual shopping for them using money that the kid earned by reading books. If you would need to explain to an alien who has landed on Earth what the internet is, what would you tell them?
The internet is something where you search, for example, you can look for music. You can also watch videos from around the world, and you can program stuff.
Like most of the kids interviewed, this kid uses the internet mostly for media consumption, but with the difference that they also engage with technology by way of programming using Purple Mash. drawing of the internet by a 10 year old showing a Youtube prank channel, an external device trackpad, and headphones In their drawing we see a Youtube prank channel on a screen, an external trackpad on the right (likely it s not a touch screen), and headphones. Notice how there is no keyboard, or maybe it s folded away. If you could ask a nice and friendly dragon anything you d like to learn about the internet, what would it be?
How do I shutdown my dad s computer forever?
And what is it that he would do to improve the internet for everyone? Contrary to the kid living in the US, they think that
It takes too much time to load stuff!
I wonder if this kid experiences the internet as being slow because they use the mobile network or because their connection somehow gets throttled as a way to control media consumption, or if the German internet infrastructure is just so much worse in certain regions If you could improve the internet for everyone, what would you do first? I d make a new Firefox app that loads the internet much faster.

The software engineer s daughter This girl is only 8 years old, she hates unicorns, and her dad is also a software engineer. She uses a smartphone, controlled by her parents. My impression of the interview is that at this age, kids slightly mix up the internet with the devices that they use to access the internet. drawing of the internet by an 8 year old girl, Showing Google and the interface to call and text someone In her drawing, we see again Google - it s clearly everywhere - and also the interfaces for calling and texting someone. To explain what the internet is, besides the fact that one can use it for calling and listening to music, she says:
[The internet] is something that you can [use to] see someone who is far away, so that you don t need to take time to get to them.
Now, that s a great explanation, the internet providing the possibility for communication over a distance :) If she could ask a friendly dragon something she always wanted to know, she d ask how to make her phone come alive:
that it can talk to you, that it can see you, that it can smile and has eyes. It s like a new family member, you can talk to it.
Sounds a bit like Siri, Alexa, or Furby, doesn t it? If you could improve the internet for everyone, what would you do first? She d have the phone be able to decide over her free time, her phone time. That would make the world better, not for the kids, but certainly for the parents.

The antifascist kid This German boy s dad has a background in electrotechnical engineering. He s 10 years old and he told me he s using the internet a lot for searching things for example about his passion: the firefighters. For him, the internet is:
An invisible world. A virtual world. But there s also the darknet.
He told me he always watches that German show on public TV for kids that explains stuff: Checker Tobi. (In 2014, Checker Tobi actually produced an episode about the internet, which I d criticize for having only male characters, except for one female character: a secretary Google, a nice and friendly woman guiding the way through the huge library that s the internet ) This kid was the only one interviewed who managed to actually explain something about the internet, or rather about the hypertextual structure of the web. When I asked him to draw the internet, he made a drawing of a pin board. He explained:
Many items are attached to the pin board, and on the top left corner there s a computer, for example with Youtube and one can navigate like that between all the items, and start again from the beginning when done.
hypertext structure representing the internet drawn by a kid When I asked if he knew what actually happens between the device and a website he visits, he put forth the hypothesis of the existence of some kind of
Waves, internet waves - all this stuff somehow needs to be transmitted.
What he d like to learn:
How to get into the darknet? How do you become a Whitehat? I ve heard these words on the internet, the internet makes me clever.
And what would he change on the internet if he could?
I want that right wing extreme stuff is not accessible anymore, or at least, that it rains turds ( Kackw rste ) whenever people watch such stuff. Or that people are always told: This video is scum.
I suspect that his father has been talking with him about these things, and maybe these are also subjects he heard about when listening to punk music (he told me he does), or browsing Youtube.

Future projects To me this has been pretty insightful. I might share some more internet drawings by adults in the future, which I think are also really interesting, as they show very different things depending on the age of the person. I ve been using the information gathered to work on a children s book which I hope to be able to share with you next year.

27 December 2023

David Bremner: Added a derived backend for org export

See web-stacker for the background. yantar92 on #org-mode pointed out that a derived backend would be a cleaner solution. I had initially thought it was too complicated, but I have to agree the example in the org-mode documentation does pretty much what I need. This new approach has the big advantage that the generation of URLs happens at export time, so it's not possible for the displayed program code and the version encoded in the URL to get out of sync.
;; derived backend to customize src block handling
(defun my-beamer-src-block (src-block contents info)
  "Transcode a SRC-BLOCK element from Org to beamer
         CONTENTS is nil.  INFO is a plist used as a communication
         channel."
  (let ((attr (org-export-read-attribute :attr_latex src-block :stacker)))
    (concat
     (when (or (not attr) (string= attr "both"))
       (org-export-with-backend 'beamer src-block contents info))
     (when attr
       (let* ((body  (org-element-property :value src-block))
              (table '(? ?\n ?: ?/ ?? ?# ?[ ?] ?@ ?! ?$ ?& ??
                         ?( ?) ?* ?+ ?, ?= ?%))
              (slug (org-link-encode body table))
              (simplified (replace-regexp-in-string "[%]20" "+" slug nil 'literal)))
         (format "\n\\stackerlink %s " simplified))))))
(defun my-beamer-export-to-latex
    (&optional async subtreep visible-only body-only ext-plist)
  "Export current buffer as a (my)Beamer presentation (tex).
    See org-beamer-export-to-latex for full docs"
  (interactive)
  (let ((file (org-export-output-file-name ".tex" subtreep)))
    (org-export-to-file 'my-beamer file
      async subtreep visible-only body-only ext-plist)))
(defun my-beamer-export-to-pdf
    (&optional async subtreep visible-only body-only ext-plist)
  "Export current buffer as a (my)Beamer presentation (PDF).
  See org-beamer-export-to-pdf for full docs."
  (interactive)
  (let ((file (org-export-output-file-name ".tex" subtreep)))
    (org-export-to-file 'my-beamer file
      async subtreep visible-only body-only ext-plist
      #'org-latex-compile)))
(with-eval-after-load "ox-beamer"
  (org-export-define-derived-backend 'my-beamer 'beamer
    :translate-alist '((src-block . my-beamer-src-block))
    :menu-entry '(?l 1 ((?m "my beamer .tex" my-beamer-export-to-latex)
                        (?M "my beamer .pdf" my-beamer-export-to-pdf)))))
An example of using this in an org-document would as below. The first source code block generates only a link in the output while the last adds a generated link to the normal highlighted source code.
* Stuff
** Frame
#+attr_latex: :stacker t
#+NAME: last
#+BEGIN_SRC stacker :eval no
  (f)
#+END_SRC
#+name: smol-example
#+BEGIN_SRC stacker :noweb yes
  (defvar x 1)
  (deffun (f)
    (let ([y 2])
      (deffun (h)
        (+ x y))
      (h)))
  <<last>>
#+END_SRC
** Another Frame 
#+ATTR_LATEX: :stacker both
#+begin_src smol :noweb yes
  <<smol-example>>
#+end_src

25 December 2023

Sergio Talens-Oliag: GitLab CI/CD Tips: Automatic Versioning Using semantic-release

This post describes how I m using semantic-release on gitlab-ci to manage versioning automatically for different kinds of projects following a simple workflow (a develop branch where changes are added or merged to test new versions, a temporary release/#.#.# to generate the release candidate versions and a main branch where the final versions are published).

What is semantic-releaseIt is a Node.js application designed to manage project versioning information on Git Repositories using a Continuous integration system (in this post we will use gitlab-ci)

How does it workBy default semantic-release uses semver for versioning (release versions use the format MAJOR.MINOR.PATCH) and commit messages are parsed to determine the next version number to publish. If after analyzing the commits the version number has to be changed, the command updates the files we tell it to (i.e. the package.json file for nodejs projects and possibly a CHANGELOG.md file), creates a new commit with the changed files, creates a tag with the new version and pushes the changes to the repository. When running on a CI/CD system we usually generate the artifacts related to a release (a package, a container image, etc.) from the tag, as it includes the right version number and usually has passed all the required tests (it is a good idea to run the tests again in any case, as someone could create a tag manually or we could run extra jobs when building the final assets if they fail it is not a big issue anyway, numbers are cheap and infinite, so we can skip releases if needed).

Commit messages and versioningThe commit messages must follow a known format, the default module used to analyze them uses the angular git commit guidelines, but I prefer the conventional commits one, mainly because it s a lot easier to use when you want to update the MAJOR version. The commit message format used must be:
<type>(optional scope): <description>
[optional body]
[optional footer(s)]
The system supports three types of branches: release, maintenance and pre-release, but for now I m not using maintenance ones. The branches I use and their types are:
  • main as release branch (final versions are published from there)
  • develop as pre release branch (used to publish development and testing versions with the format #.#.#-SNAPSHOT.#)
  • release/#.#.# as pre release branches (they are created from develop to publish release candidate versions with the format #.#.#-rc.# and once they are merged with main they are deleted)
On the release branch (main) the version number is updated as follows:
  1. The MAJOR number is incremented if a commit with a BREAKING CHANGE: footer or an exclamation (!) after the type/scope is found in the list of commits found since the last version change (it looks for tags on the same branch).
  2. The MINOR number is incremented if the MAJOR number is not going to be changed and there is a commit with type feat in the commits found since the last version change.
  3. The PATCH number is incremented if neither the MAJOR nor the MINOR numbers are going to be changed and there is a commit with type fix in the the commits found since the last version change.
On the pre release branches (develop and release/#.#.#) the version and pre release numbers are always calculated from the last published version available on the branch (i. e. if we published version 1.3.2 on main we need to have the commit with that tag on the develop or release/#.#.# branch to get right what will be the next version). The version number is updated as follows:
  1. The MAJOR number is incremented if a commit with a BREAKING CHANGE: footer or an exclamation (!) after the type/scope is found in the list of commits found since the last released version.In our example it was 1.3.2 and the version is updated to 2.0.0-SNAPSHOT.1 or 2.0.0-rc.1 depending on the branch.
  2. The MINOR number is incremented if the MAJOR number is not going to be changed and there is a commit with type feat in the commits found since the last released version.In our example the release was 1.3.2 and the version is updated to 1.4.0-SNAPSHOT.1 or 1.4.0-rc.1 depending on the branch.
  3. The PATCH number is incremented if neither the MAJOR nor the MINOR numbers are going to be changed and there is a commit with type fix in the the commits found since the last version change.In our example the release was 1.3.2 and the version is updated to 1.3.3-SNAPSHOT.1 or 1.3.3-rc.1 depending on the branch.
  4. The pre release number is incremented if the MAJOR, MINOR and PATCH numbers are not going to be changed but there is a commit that would otherwise update the version (i.e. a fix on 1.3.3-SNAPSHOT.1 will set the version to 1.3.3-SNAPSHOT.2, a fix or feat on 1.4.0-rc.1 will set the version to 1.4.0-rc.2 an so on).

How do we manage its configurationAlthough the system is designed to work with nodejs projects, it can be used with multiple programming languages and project types. For nodejs projects the usual place to put the configuration is the project s package.json, but I prefer to use the .releaserc file instead. As I use a common set of CI templates, instead of using a .releaserc on each project I generate it on the fly on the jobs that need it, replacing values related to the project type and the current branch on a template using the tmpl command (lately I use a branch of my own fork while I wait for some feedback from upstream, as you will see on the Dockerfile).

Container used to run itAs we run the command on a gitlab-ci job we use the image built from the following Dockerfile:
Dockerfile
# Semantic release image
FROM golang:alpine AS tmpl-builder
#RUN go install github.com/krakozaure/tmpl@v0.4.0
RUN go install github.com/sto/tmpl@v0.4.0-sto.2
FROM node:lts-alpine
COPY --from=tmpl-builder /go/bin/tmpl /usr/local/bin/tmpl
RUN apk update &&\
  apk upgrade &&\
  apk add curl git jq openssh-keygen yq zip &&\
  npm install --location=global\
    conventional-changelog-conventionalcommits@6.1.0\
    @qiwi/multi-semantic-release@7.0.0\
    semantic-release@21.0.7\
    @semantic-release/changelog@6.0.3\
    semantic-release-export-data@1.0.1\
    @semantic-release/git@10.0.1\
    @semantic-release/gitlab@9.5.1\
    @semantic-release/release-notes-generator@11.0.4\
    semantic-release-replace-plugin@1.2.7\
    semver@7.5.4\
  &&\
  rm -rf /var/cache/apk/*
CMD ["/bin/sh"]

How and when is it executedThe job that runs semantic-release is executed when new commits are added to the develop, release/#.#.# or main branches (basically when something is merged or pushed) and after all tests have passed (we don t want to create a new version that does not compile or passes at least the unit tests). The job is something like the following:
semantic_release:
  image: $SEMANTIC_RELEASE_IMAGE
  rules:
    - if: '$CI_COMMIT_BRANCH =~ /^(develop main release\/\d+.\d+.\d+)$/'
      when: always
  stage: release
  before_script:
    - echo "Loading scripts.sh"
    - . $ASSETS_DIR/scripts.sh
  script:
    - sr_gen_releaserc_json
    - git_push_setup
    - semantic-release
Where the SEMANTIC_RELEASE_IMAGE variable contains the URI of the image built using the Dockerfile above and the sr_gen_releaserc_json and git_push_setup are functions defined on the $ASSETS_DIR/scripts.sh file:
  • The sr_gen_releaserc_json function generates the .releaserc.json file using the tmpl command.
  • The git_push_setup function configures git to allow pushing changes to the repository with the semantic-release command, optionally signing them with a SSH key.

The sr_gen_releaserc_json functionThe code for the sr_gen_releaserc_json function is the following:
sr_gen_releaserc_json()
 
  # Use nodejs as default project_type
  project_type="$ PROJECT_TYPE:-nodejs "
  # REGEX to match the rc_branch name
  rc_branch_regex='^release\/[0-9]\+\.[0-9]\+\.[0-9]\+$'
  # PATHS on the local ASSETS_DIR
  assets_dir="$ CI_PROJECT_DIR /$ ASSETS_DIR "
  sr_local_plugin="$ assets_dir /local-plugin.cjs"
  releaserc_tmpl="$ assets_dir /releaserc.json.tmpl"
  pipeline_runtime_values_yaml="/tmp/releaserc_values.yaml"
  pipeline_values_yaml="$ assets_dir /values_$ project_type _project.yaml"
  # Destination PATH
  releaserc_json=".releaserc.json"
  # Create an empty pipeline_values_yaml if missing
  test -f "$pipeline_values_yaml"   : >"$pipeline_values_yaml"
  # Create the pipeline_runtime_values_yaml file
  echo "branch: $ CI_COMMIT_BRANCH " >"$pipeline_runtime_values_yaml"
  echo "gitlab_url: $ CI_SERVER_URL " >"$pipeline_runtime_values_yaml"
  # Add the rc_branch name if we are on an rc_branch
  if [ "$(echo "$CI_COMMIT_BRANCH"   sed -ne "/$rc_branch_regex/ p ")" ]; then
    echo "rc_branch: $ CI_COMMIT_BRANCH " >>"$pipeline_runtime_values_yaml"
  elif [ "$(echo "$CI_MERGE_REQUEST_SOURCE_BRANCH_NAME"  
      sed -ne "/$rc_branch_regex/ p ")" ]; then
    echo "rc_branch: $ CI_MERGE_REQUEST_SOURCE_BRANCH_NAME " \
      >>"$pipeline_runtime_values_yaml"
  fi
  echo "sr_local_plugin: $ sr_local_plugin " >>"$pipeline_runtime_values_yaml"
  # Create the releaserc_json file
  tmpl -f "$pipeline_runtime_values_yaml" -f "$pipeline_values_yaml" \
    "$releaserc_tmpl"   jq . >"$releaserc_json"
  # Remove the pipeline_runtime_values_yaml file
  rm -f "$pipeline_runtime_values_yaml"
  # Print the releaserc_json file
  print_file_collapsed "$releaserc_json"
  # --*-- BEG: NOTE --*--
  # Rename the package.json to ignore it when calling semantic release.
  # The idea is that the local-plugin renames it back on the first step of the
  # semantic-release process.
  # --*-- END: NOTE --*--
  if [ -f "package.json" ]; then
    echo "Renaming 'package.json' to 'package.json_disabled'"
    mv "package.json" "package.json_disabled"
  fi
 
Almost all the variables used on the function are defined by gitlab except the ASSETS_DIR and PROJECT_TYPE; in the complete pipelines the ASSETS_DIR is defined on a common file included by all the pipelines and the project type is defined on the .gitlab-ci.yml file of each project. If you review the code you will see that the file processed by the tmpl command is named releaserc.json.tmpl, its contents are shown here:
 
  "plugins": [
     - if .sr_local_plugin  
    "  .sr_local_plugin  ",
     - end  
    [
      "@semantic-release/commit-analyzer",
       
        "preset": "conventionalcommits",
        "releaseRules": [
            "breaking": true, "release": "major"  ,
            "revert": true, "release": "patch"  ,
            "type": "feat", "release": "minor"  ,
            "type": "fix", "release": "patch"  ,
            "type": "perf", "release": "patch"  
        ]
       
    ],
     - if .replacements  
    [
      "semantic-release-replace-plugin",
        "replacements":   .replacements   toJson    
    ],
     - end  
    "@semantic-release/release-notes-generator",
     - if eq .branch "main"  
    [
      "@semantic-release/changelog",
        "changelogFile": "CHANGELOG.md", "changelogTitle": "# Changelog"  
    ],
     - end  
    [
      "@semantic-release/git",
       
        "assets":   if .assets   .assets   toJson   else  []  end  ,
        "message": "ci(release): v$ nextRelease.version \n\n$ nextRelease.notes "
       
    ],
    [
      "@semantic-release/gitlab",
        "gitlabUrl": "  .gitlab_url  ", "successComment": false  
    ]
  ],
  "branches": [
      "name": "develop", "prerelease": "SNAPSHOT"  ,
     - if .rc_branch  
      "name": "  .rc_branch  ", "prerelease": "rc"  ,
     - end  
    "main"
  ]
 
The values used to process the template are defined on a file built on the fly (releaserc_values.yaml) that includes the following keys and values:
  • branch: the name of the current branch
  • gitlab_url: the URL of the gitlab server (the value is taken from the CI_SERVER_URL variable)
  • rc_branch: the name of the current rc branch; we only set the value if we are processing one because semantic-release only allows one branch to match the rc prefix and if we use a wildcard (i.e. release/*) but the users keep more than one release/#.#.# branch open at the same time the calls to semantic-release will fail for sure.
  • sr_local_plugin: the path to the local plugin we use (shown later)
The template also uses a values_$ project_type _project.yaml file that includes settings specific to the project type, the one for nodejs is as follows:
replacements:
  - files:
      - "package.json"
    from: "\"version\": \".*\""
    to: "\"version\": \"$ nextRelease.version \""
assets:
  - "CHANGELOG.md"
  - "package.json"
The replacements section is used to update the version field on the relevant files of the project (in our case the package.json file) and the assets section includes the files that will be committed to the repository when the release is published (looking at the template you can see that the CHANGELOG.md is only updated for the main branch, we do it this way because if we update the file on other branches it creates a merge nightmare and we are only interested on it for released versions anyway). The local plugin adds code to rename the package.json_disabled file to package.json if present and prints the last and next versions on the logs with a format that can be easily parsed using sed:
local-plugin.cjs
// Minimal plugin to:
// - rename the package.json_disabled file to package.json if present
// - log the semantic-release last & next versions
function verifyConditions(pluginConfig, context)  
  var fs = require('fs');
  if (fs.existsSync('package.json_disabled'))  
    fs.renameSync('package.json_disabled', 'package.json');
    context.logger.log( verifyConditions: renamed 'package.json_disabled' to 'package.json' );
   
 
function analyzeCommits(pluginConfig, context)  
  if (context.lastRelease && context.lastRelease.version)  
    context.logger.log( analyzeCommits: LAST_VERSION=$ context.lastRelease.version  );
   
 
function verifyRelease(pluginConfig, context)  
  if (context.nextRelease && context.nextRelease.version)  
    context.logger.log( verifyRelease: NEXT_VERSION=$ context.nextRelease.version  );
   
 
module.exports =  
  verifyConditions,
  analyzeCommits,
  verifyRelease
 

The git_push_setup functionThe code for the git_push_setup function is the following:
git_push_setup()
 
  # Update global credentials to allow git clone & push for all the group repos
  git config --global credential.helper store
  cat >"$HOME/.git-credentials" <<EOF
https://fake-user:$ GITLAB_REPOSITORY_TOKEN @gitlab.com
EOF
  # Define user name, mail and signing key for semantic-release
  user_name="$SR_USER_NAME"
  user_email="$SR_USER_EMAIL"
  ssh_signing_key="$SSH_SIGNING_KEY"
  # Export git user variables
  export GIT_AUTHOR_NAME="$user_name"
  export GIT_AUTHOR_EMAIL="$user_email"
  export GIT_COMMITTER_NAME="$user_name"
  export GIT_COMMITTER_EMAIL="$user_email"
  # Sign commits with ssh if there is a SSH_SIGNING_KEY variable
  if [ "$ssh_signing_key" ]; then
    echo "Configuring GIT to sign commits with SSH"
    ssh_keyfile="/tmp/.ssh-id"
    : >"$ssh_keyfile"
    chmod 0400 "$ssh_keyfile"
    echo "$ssh_signing_key"   tr -d '\r' >"$ssh_keyfile"
    git config gpg.format ssh
    git config user.signingkey "$ssh_keyfile"
    git config commit.gpgsign true
  fi
 
The function assumes that the GITLAB_REPOSITORY_TOKEN variable (set on the CI/CD variables section of the project or group we want) contains a token with read_repository and write_repository permissions on all the projects we are going to use this function. The SR_USER_NAME and SR_USER_EMAIL variables can be defined on a common file or the CI/CD variables section of the project or group we want to work with and the script assumes that the optional SSH_SIGNING_KEY is exported as a CI/CD default value of type variable (that is why the keyfile is created on the fly) and git is configured to use it if the variable is not empty.
Warning: Keep in mind that the variables GITLAB_REPOSITORY_TOKEN and SSH_SIGNING_KEY contain secrets, so probably is a good idea to make them protected (if you do that you have to make the develop, main and release/* branches protected too).
Warning: The semantic-release user has to be able to push to all the projects on those protected branches, it is a good idea to create a dedicated user and add it as a MAINTAINER for the projects we want (the MAINTAINERS need to be able to push to the branches), or, if you are using a Gitlab with a Premium license you can use the api to allow the semantic-release user to push to the protected branches without allowing it for any other user.

The semantic-release commandOnce we have the .releaserc file and the git configuration ready we run the semantic-release command. If the branch we are working with has one or more commits that will increment the version, the tool does the following (note that the steps are described are the ones executed if we use the configuration we have generated):
  1. It detects the commits that will increment the version and calculates the next version number.
  2. Generates the release notes for the version.
  3. Applies the replacements defined on the configuration (in our example updates the version field on the package.json file).
  4. Updates the CHANGELOG.md file adding the release notes if we are going to publish the file (when we are on the main branch).
  5. Creates a commit if all or some of the files listed on the assets key have changed and uses the commit message we have defined, replacing the variables for their current values.
  6. Creates a tag with the new version number and the release notes.
  7. As we are using the gitlab plugin after tagging it also creates a release on the project with the tag name and the release notes.

Notes about the git workflows and merges between branchesIt is very important to remember that semantic-release looks at the commits of a given branch when calculating the next version to publish, that has two important implications:
  1. On pre release branches we need to have the commit that includes the tag with the released version, if we don t have it the next version is not calculated correctly.
  2. It is a bad idea to squash commits when merging a branch to another one, if we do that we will lose the information semantic-release needs to calculate the next version and even if we use the right prefix for the squashed commit (fix, feat, ) we miss all the messages that would otherwise go to the CHANGELOG.md file.
To make sure that we have the right commits on the pre release branches we should merge the main branch changes into the develop one after each release tag is created; in my pipelines the fist job that processes a release tag creates a branch from the tag and an MR to merge it to develop. The important thing about that MR is that is must not be squashed, if we do that the tag commit will probably be lost, so we need to be careful. To merge the changes directly we can run the following code:
# Set the SR_TAG variable to the tag you want to process
SR_TAG="v1.3.2"
# Fetch all the changes
git fetch --all --prune
# Switch to the main branch
git switch main
# Pull all the changes
git pull
# Switch to the development branch
git switch develop
# Pull all the changes
git pull
# Create followup branch from tag
git switch -c "followup/$SR_TAG" "$SR_TAG"
# Change files manually & commit the changed files
git commit -a --untracked-files=no -m "ci(followup): $SR_TAG to develop"
# Switch to the development branch
git switch develop
# Merge the followup branch into the development one using the --no-ff option
git merge --no-ff "followup/$SR_TAG"
# Remove the followup branch
git branch -d "followup/$SR_TAG"
# Push the changes
git push
If we can t push directly to develop we can create a MR pushing the followup branch after committing the changes, but we have to make sure that we don t squash the commits when merging or it will not work as we want.

24 December 2023

Russ Allbery: Review: Liberty's Daughter

Review: Liberty's Daughter, by Naomi Kritzer
Publisher: Fairwood Press
Copyright: November 2023
ISBN: 1-958880-16-7
Format: Kindle
Pages: 257
Liberty's Daughter is a stand-alone near-future science fiction fix-up novel. The original stories were published in Fantasy and Science Fiction between 2012 and 2015. Beck Garrison lives on New Minerva (Min), one of a cluster of libertarian seasteads 220 nautical miles off the coast of Los Angeles. Her father brought her to Min when she was four, so it's the only life she knows. As this story opens, she's picked up a job for pocket change: finding very specific items that people want to buy. Since any new goods have to be shipped in and the seasteads have an ambiguous legal status, they don't get Amazon deliveries, but there are enough people (and enough tourists who bring high-value goods for trade) that someone probably has whatever someone else is looking for. Even sparkly high-heeled sandals size eight. Beck's father is high in the informal power structure of the seasteads for reasons that don't become apparent until very late in this book. Beck therefore has a comfortable, albeit cramped, life. The social protections, self-confidence, and feelings of invincibility that come with that wealth serve her well as a finder. After the current owner of the sandals bargains with her to find a person rather than an object, that privilege also lets her learn quite a lot before she starts getting into trouble. The political background of this novel is going to require some suspension of disbelief. The premise is that one of those harebrained libertarian schemes to form a freedom utopia has been successful enough to last for 49 years and attract 80,000 permanent residents. (It's a libertarian seastead so a lot of those residents are indentured slaves, as one does in libertarian philosophy. The number of people with shares, like Beck's father, is considerably smaller.) By the end of the book, Kritzer has offered some explanations for why the US would allow such a place to continue to exist, but the chances of the famously fractious con artists and incompetents involved in these types of endeavors creating something that survived internal power struggles for that long seem low. One has to roll with it for story reasons: Kritzer needs the population to be large enough for a plot, and the history to be long enough for Beck to exist as a character. The strength of this book is Beck, and specifically the fact that Beck is a second-generation teenager who grew up on the seastead. Unlike a lot of her age peers with their Cayman Islands vacations, she's never left and has no experience with life on land. She considers many things to be perfectly normal that are not at all normal to the reader and the various reader surrogates who show up over the course of the book. She also has the instinctive feel for seastead politics of the child of a prominent figure in a small town. And, most importantly, she has formed her own sense of morality and social structure that matches neither that of the reader nor that of her father. Liberty's Daughter is told in first-person by Beck. Judging the authenticity of Gen-Z thought processes is not one of my strengths, but Beck felt right to me. Her narration is dryly matter-of-fact, with only brief descriptions of her emotional reactions, but her personality shines in the occasional sarcasm and obstinacy. Kritzer has the teenage bafflement at the stupidity of adults down pat, as well as the tendency to jump head-first into ideas and make some decisions through sheer stubbornness. This is not one of those fix-up novels where the author has reworked the stories sufficiently that the original seams don't show. It is very episodic; compared to a typical novel of this length, there's more plot but less character growth. It's a good book when you want to be pulled into a stream of events that moves right along. This is not the book for deep philosophical examinations of the basis of a moral society, but it does have, around the edges, is the humans build human societies and develop elaborate social conventions and senses of belonging no matter how stupid the original philosophical foundations were. Even societies built on nasty exploitation can engender a sort of loyalty. Beck doesn't support the worst parts of her weird society, but she wants to fix it, not burn it to the ground. I thought there was a profound observation there. That brings me to my complaint: I hated the ending. Liberty's Daughter is in part Beck's fight for her own autonomy, both moral and financial, and the beginnings of an effort to turn her home into the sort of home she wants. By the end of the book, she's testing the limits of what she can accomplish, solidifying her own moral compass, and deciding how she wants to use the social position she inherited. It felt like the ending undermined all of that and treated her like a child. I know adolescence comes with those sorts of reversals, but I was still so mad. This is particularly annoying since I otherwise want to recommend this book. It's not ground-breaking, it's not that deep, but it was a thoroughly enjoyable day's worth of entertainment with a likable protagonist. Just don't read the last chapter, I guess? Or have more tolerance than I have for people treating sixteen-year-olds as if they're not old enough to make decisions. Content warnings: pandemic. Rating: 7 out of 10

22 December 2023

Joachim Breitner: The Haskell Interlude Podcast

It was pointed out to me that I have not blogged about this, so better now than never: Since 2021 I am together with four other hosts producing a regular podcast about Haskell, the Haskell Interlude. Roughly every two weeks two of us interview someone from the Haskell Community, and we chat for approximately an hour about how they came to Haskell, what they are doing with it, why they are doing it and what else is on their mind. Sometimes we talk to very famous people, like Simon Peyton Jones, and sometimes to people who maybe should be famous, but aren t quite yet. For most episodes we also have a transcript, so you can read the interviews instead, if you prefer, and you should find the podcast on most podcast apps as well. I do not know how reliable these statistics are, but supposedly we regularly have around 1300 listeners. We don t get much feedback, however, so if you like the show, or dislike it, or have feedback, let us know (for example on the Haskell Disourse, which has a thread for each episode). At the time of writing, we released 40 episodes. For the benefit of my (likely hypothetical) fans, or those who want to train an AI voice model for nefarious purposes, here is the list of episodes co-hosted by me: Can t decide where to start? The one with Ryan Trinkle might be my favorite. Thanks to the Haskell Foundation and its sponsors for supporting this podcast (hosting, editing, transscription).

20 December 2023

Ulrike Uhlig: How volunteer work in F/LOSS exacerbates pre-existing lines of oppression, and what that has to do with low diversity

This is a post I wrote in June 2022, but did not publish back then. After first publishing it in December 2023, a perfectionist insecure part of me unpublished it again. After receiving positive feedback, i slightly amended and republish it now. In this post, I talk about unpaid work in F/LOSS, taking on the example of hackathons, and why, in my opinion, the expectation of volunteer work is hurting diversity. Disclaimer: I don t have all the answers, only some ideas and questions.

Previous findings In 2006, the Flosspols survey searched to explain the role of gender in free/libre/open source software (F/LOSS) communities because an earlier [study] revealed a significant discrepancy in the proportion of men to women. It showed that just about 1.5% of F/LOSS community members were female at that time, compared with 28% in proprietary software (which is also a low number). Their key findings were, to name just a few:
  • that F/LOSS rewards the producing code rather than the producing software. It thereby puts most emphasis on a particular skill set. Other activities such as interface design or documentation are understood as less technical and therefore less prestigious.
  • The reliance on long hours of intensive computing in writing successful code means that men, who in general assume that time outside of waged labour is theirs , are freer to participate than women, who normally still assume a disproportionate amount of domestic responsibilities. Female F/LOSS participants, however, seem to be able to allocate a disproportionate larger share of their leisure time for their F/LOSS activities. This gives an indication that women who are not able to spend as much time on voluntary activities have difficulties to integrate into the community.
We also know from the 2016 Debian survey, published in 2021, that a majority of Debian contributors are employed, rather than being contractors, and rather than being students. Also, 95.5% of respondents to that study were men between the ages of 30 and 49, highly educated, with the largest groups coming from Germany, France, USA, and the UK. The study found that only 20% of the respondents were being paid to work on Debian. Half of these 20% estimate that the amount of work on Debian they are being paid for corresponds to less than 20% of the work they do there. On the other side, there are 14% of those who are being paid for Debian work who declared that 80-100% of the work they do in Debian is remunerated.

So, if a majority of people is not paid, why do they work on F/LOSS? Or: What are the incentives of free software? In 2021, Louis-Philippe V ronneau aka Pollo, who is not only a Debian Developer but also an economist, published his thesis What are the incentive structures of free software (The actual thesis was written in French). One very interesting finding Pollo pointed out is this one:
Indeed, while we have proven that there is a strong and significative correlation between the income and the participation in a free/libre software project, it is not possible for us to pronounce ourselves about the causality of this link.
In the French original text:
En effet, si nous avons prouv qu il existe une corr lation forte et significative entre le salaire et la participation un projet libre, il ne nous est pas possible de nous prononcer sur la causalit de ce lien.
Said differently, it is certain that there is a relationship between income and F/LOSS contribution, but it s unclear whether working on free/libre software ultimately helps finding a well paid job, or if having a well paid job is the cause enabling work on free/libre software. I would like to scratch this question a bit further, mostly relying on my own observations, experiences, and discussions with F/LOSS contributors.

Volunteer work is unpaid work We often hear of hackathons, hack weeks, or hackfests. I ve been at some such events myself, Tails organized one, the IETF regularly organizes hackathons, and last week (June 2022!) I saw an invitation for a hack week with the Torproject. This type of event generally last several days. While the people who organize these events are being paid by the organizations they work for, participants on the other hand are generally joining on a volunteer basis. Who can we expect to show up at this type of event under these circumstances as participants? To answer this question, I collected some ideas:
  • people who have an employer sponsoring their work
  • people who have a funder/grant sponsoring their work
  • people who have a high income and can take time off easily (in that regard, remember the Gender Pay Gap, women often earn less for the same work than men)
  • people who rely on family wealth (living off an inheritance, living on rights payments from a famous grandparent - I m not making these situations up, there are actual people in such financially favorable situations )
  • people who don t need much money because they don t have to pay rent or pay low rent (besides house owners that category includes people who live in squats or have social welfare paying for their rent, people who live with parents or caretakers)
  • people who don t need to do care work (for children, elderly family members, pets. Remember that most care work is still done by women.)
  • students who have financial support or are in a situation in which they do not yet need to generate a lot of income
  • people who otherwise have free time at their disposal
So, who, in your opinion, fits these unwritten requirements? Looking at this list, it s pretty clear to me why we d mostly find white men from the Global North, generally with higher education in hackathons and F/LOSS development. ( Great, they re a culture fit! ) Yes, there will also always be some people of marginalized groups who will attend such events because they expect to network, to find an internship, to find a better job in the future, or to add their participation to their curriculum. To me, this rings a bunch of alarm bells.

Low diversity in F/LOSS projects a mirror of the distribution of wealth I believe that the lack of diversity in F/LOSS is first of all a mirror of the distribution of wealth on a larger level. And by wealth I m referring to financial wealth as much as to social wealth in the sense of Bourdieu: Families of highly educated parents socially reproducing privilege by allowing their kids to attend better schools, supporting and guiding them in their choices of study and work, providing them with relations to internships acting as springboards into well paid jobs and so on. That said, we should ask ourselves as well:

Do F/LOSS projects exacerbate existing lines of oppression by relying on unpaid work? Let s look again at the causality question of Pollo s research (in my words):
It is unclear whether working on free/libre software ultimately helps finding a well paid job, or if having a well paid job is the cause enabling work on free/libre software.
Maybe we need to imagine this cause-effect relationship over time: as a student, without children and lots of free time, hopefully some money from the state or the family, people can spend time on F/LOSS, collect experience, earn recognition - and later find a well-paid job and make unpaid F/LOSS contributions into a hobby, cementing their status in the community, while at the same time generating a sense of well-being from working on the common good. This is a quite common scenario. As the Flosspols study revealed however, boys often get their own computer at the age of 14, while girls get one only at the age of 20. (These numbers might be slightly different now, and possibly many people don t own an actual laptop or desktop computer anymore, instead they own mobile devices which are not exactly inciting them to look behind the surface, take apart, learn, appropriate technology.) In any case, the above scenario does not allow for people who join F/LOSS later in life, eg. changing careers, to find their place. I believe that F/LOSS projects cannot expect to have more women, people of color, people from working class backgrounds, people from outside of Germany, France, USA, UK, Australia, and Canada on board as long as volunteer work is the status quo and waged labour an earned privilege.

Wait, are you criticizing all these wonderful people who sacrifice their free time to work towards common good? No, that s definitely not my intention, I m glad that F/LOSS exists, and the F/LOSS ecosystem has always represented a small utopia to me that is worth cherishing and nurturing. However, I think we still need to talk more about the lack of diversity, and investigate it further.

Some types of work are never being paid Besides free work at hacking events, let me also underline that a lot of work in F/LOSS is not considered payable work (yes, that s an oxymoron!). Which F/LOSS project for example, has ever paid translators a decent fee? Which project has ever considered that doing the social glue work, often done by women in the projects, is work that should be paid for? Which F/LOSS projects pay the people who do their Debian packaging rather than relying on yet another already well-paid white man who can afford doing this work for free all the while holding up how great the F/LOSS ecosystem is? And how many people on opensourcedesign jobs are looking to get their logo or website done for free? (Isn t that heart icon appealing to your altruistic empathy?) In my experience even F/LOSS projects which are trying to do the right thing by paying everyone the same amount of money per hour run into issues when it turns out that not all hours are equal and that some types of work do not qualify for remuneration at all or that the rules for the clocking of work are not universally applied in the same way by everyone.

Not every interaction should have a monetary value, but Some of you want to keep working without being paid, because that feels a bit like communism within capitalism, it makes you feel good to contribute to the greater good while not having the system determine your value over money. I hear you. I ve been there (and sometimes still am). But as long as we live in this system, even though we didn t choose to and maybe even despise it - communism is not about working for free, it s about getting paid equally and adequately. We may not think about it while under the age of 40 or 45, but working without adequate financial compensation, even half of the time, will ultimately result in not being able to care for oneself when sick, when old. And while this may not be an issue for people who inherit wealth, or have an otherwise safe economical background, eg. an academic salary, it is a huge problem and barrier for many people coming out of the working or service classes. (Oh and please, don t repeat the neoliberal lie that everyone can achieve whatever they aim for, if they just tried hard enough. French research shows that (in France) one has only 30% chance to become a class defector , and change social class upwards. But I managed to get out and move up, so everyone can! - well, if you believe that I m afraid you might be experiencing survivor bias.)

Not all bodies are equally able We should also be aware that not all of us can work with the same amount of energy either. There is yet another category of people who are excluded by the expectation of volunteer work, either because the waged labour they do already eats all of their energy, or because their bodies are not disposed to do that much work, for example because of mental health issues - such as depression-, or because of physical disabilities.

When organizing events relying on volunteer work please think about these things. Yes, you can tell people that they should ask their employer to pay them for attending a hackathon - but, as I ve hopefully shown, that would not do it for many people, especially newcomers. Instead, you could propose a fund to make it possible that people who would not normally attend can attend. DebConf is a good example for having done this for many years.

Conclusively I would like to urge free software projects that have a budget and directly pay some people from it to map where they rely on volunteer work and how this hurts diversity in their project. How do you or your project exacerbate pre-existing lines of oppression by granting or not granting monetary value to certain types of work? What is it that you take for granted? As always, I m curious about your feedback!

Worth a read These ideas are far from being new. Ashe Dryden s well-researched post The ethics of unpaid labor and the OSS community dates back to 2013 and is as important as it was ten years ago.

Next.

Previous.